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is a quiet, unshakable force, reminding me that even in the face of life’s greatest challenges, 
we rise, we endure, and we thrive. 


Israel Gbati 


Foreword 


Around the time Israel and I first began working together, we were preparing to showcase our 
non-contact gesture trackpad at the British Invention Show. With only two weeks to get ready, most 
of our time was spent soldering components and assembling mechanical systems for the presentation. 
The firmware, however, was left until the final stretch. With just one day remaining, we braced for an 
all-nighter, taking turns debugging the code with the persistence of parents tending to a restless child. 
In the final hours, with our eyes half-closing, Israel made a critical breakthrough, pulling everything 
together just in time. Though we arrived late to the event, we walked away with the Platinum Award. 


I've had the privilege of working alongside Israel for nearly a decade, acting as his tag team partner 
in this fascinating world of hardware, where electronics and mechanical design harmoniously blend 
with firmware and software to create technology at the cutting edge of innovation. 


Israel truly practices what he preaches. The teachings in this book are actively tested and applied in 
the field to advance technology aimed at improving the quality of life for individuals facing various 
health challenges. There is nothing more relevant today than an engineering text written by an active 
practitioner, especially in the midst of the most fast-paced and dynamic engineering environment in 
human history. What you hold in your hands is the culmination of years of dedicated effort, driven by 
both knowledge and real-world applications that extend far beyond theory into the practical challenges 
of technological entrepreneurship. 


Thope this book serves as a valuable resource and inspiration for all who seek to push the boundaries 
of engineering and technology. 


Georgios Papanikolaou 


Chief Operating Officer and Head of Hardware Engineering, BiostealthAI 
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Preface 


In a tech-driven world where embedded systems power nearly every modern device and innovation, 
the ability to develop efficient and reliable firmware is a prized skill. The journey from writing basic 
code to mastering low-level firmware development can be daunting, but the rewards are substantial. 
Whether it’s a home appliance, an industrial control system, or a sophisticated IoT device, embedded 
systems serve as the silent, hardworking engines behind modern technology. 


This book, Bare-Metal Embedded C Programming, was born out of a desire to help you not only write 
functional firmware but also to deeply understand the underlying mechanisms that govern how 
microcontrollers work at their core. My goal is to take you on an in-depth, technical journey into the 
heart of ARM-based microcontroller firmware development, specifically focusing on the STM32 family. 
This is not a book for the faint of heart, nor is it one for those looking for quick shortcuts. Instead, 
it is designed for individuals who are ready to step away from the comforts of pre-built libraries and 
tools to develop the skills necessary for writing efficient, bare-metal code from scratch. 


So, what exactly is bare-metal programming? Simply put, it’s the art of writing firmware that interacts 
directly with the hardware—without the abstraction layers provided by third-party libraries. This 
approach requires precision, a deep understanding of microcontroller architecture, and the ability to 
read and manipulate registers to achieve the exact behavior you want from your hardware. 


Why I Wrote This Book 


As someone with years of experience in embedded systems development, I’ve often noticed a gap in 
the way firmware development is taught. Many texts and courses focus on high-level development, 
promoting the use of pre-built libraries that abstract away the complexities of hardware interaction. 
While this approach is undoubtedly convenient and practical in many cases, it leaves a void for those 
who truly wish to understand how things work at the lowest level. I believe that understanding the 
“bare-metal” aspect of embedded systems development is essential for becoming a truly skilled 
firmware engineer. 


This book is my effort to fill that gap. Through step-by-step guidance, I'll show you how to build your 
own drivers, manipulate registers, and write code that takes full control of the microcontroller. This 
is not just about learning a new skill—it’s about achieving mastery. 
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Who this book is for 


If you're a developer, engineer, or a student eager to dive deep into the world of microcontroller firmware 
development, this book is for you. You'll find it especially valuable if you're the kind of person who 
prefers to understand what's happening under the hood, rather than relying on copy-paste solutions 
from online forums. Whether you're transitioning from other platforms or seeking to build a strong 
foundation in bare-metal development, this book will give you the hands-on experience you need. 


What This Book Covers 


Chapter 1, Setting Up the Tools of the Trade 


This chapter introduces the essential tools you'll need for development. From navigating datasheets 
to setting up your Integrated Development Environment (IDE), this chapter lays the groundwork for 
everything that follows. 


Chapter 2, Constructing Peripheral Registers from Memory Addresses 


In this chapter, we dive into the core of bare-metal programming. You'll learn how to define and access 
peripheral registers directly from memory addresses, using the official microcontroller documentation 
as your guide. 


Chapter 3, Understanding the Build Process and Exploring the GNU Toolchain 


In this chapter, we take a closer look at the embedded C build process. You'll explore how to compile 
and link code manually using the GNU Toolchain, gaining complete control over how your firmware 
is created. 


Chapter 4, Developing the Linker Script and Startup File 


In this chapter, you will learn how to write a custom linker script to define how your firmware is placed 
in the microcontroller’s memory, including allocating sections like code, data, and stack. Additionally, 
you'll develop a startup file that configures the microcontroller’s initial state, sets up the stack, initializes 
memory, and jumps to your main code. 


Chapter 5, The “Make” Build System 


Automating the build process is a critical part of embedded development. This chapter teaches you 
how to use the Make build system to streamline your workflow by creating custom Makefiles that 
automate repetitive tasks. 


Chapter 6, The Common Microcontroller Software Interface Standard (CMSIS) 


CMSIS simplifies development on ARM Cortex microcontrollers. In this chapter, you'll learn how 
to leverage CMSIS to write efficient code that takes advantage of the microcontroller’s features while 
maintaining simplicity. 
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Chapter 7, The General-Purpose Input/Output (GPIO) Peripheral 


GPIO allows your microcontroller to interact with external devices. This chapter guides you through 
developing both input and output drivers for GPIO, one of the most frequently used peripherals in 
embedded systems. 


Chapter 8, System Tick (SysTick) Timer 


Timing is essential in embedded systems, and the SysTick timer provides an easy way to generate 
precise time delays and system ticks. This chapter walks you through developing SysTick drivers for 
use in your embedded applications. 


Chapter 9, General-Purpose Timers (TIM) 


This chapter introduces you to the general-purpose timers (TIM) in STM32 microcontrollers, teaching 
you how to develop timer drivers for tasks that require precise timing. 


Chapter 10, The Universal Asynchronous Receiver/Transmitter Protocol 


Communication is a key aspect of embedded systems. This chapter focuses on the UART protocol, 
one of the most widely used communication protocols. You'll learn how to develop UART drivers, 
enabling your microcontroller to send and receive data from external devices. 


Chapter 11, Analog-to-Digital Converter (ADC) 


Many embedded applications require converting analog signals into digital data that your microcontroller 
can process. This chapter covers how to configure the ADC peripheral, allowing you to read and 
convert analog inputs into meaningful digital values. 


Chapter 12, Serial Peripheral Interface (SPI) 


SPI is a high-speed communication protocol commonly used in embedded systems. This chapter guides 
you through developing SPI drivers, enabling efficient communication between your microcontroller 
and other peripherals, such as sensors or memory devices. 


Chapter 13, Inter-Integrated Circuit (I2C) 


I2C is another popular communication protocol for connecting devices, it is often used for short- 
distance communication in embedded systems. This chapter covers the development of I2C drivers, 
allowing your microcontroller to communicate with multiple devices over a shared bus. 


Chapter 14, External Interrupts and Events (EXTI) 


Responsiveness is critical in embedded systems, and external interrupts allow your system to react 
to changes in its environment. This chapter covers how to configure and manage external interrupts 
and events (EXTI) for timely and efficient responses to external stimuli. 


xvii 


xviii 


Preface 


Chapter 15, The Real-Time Clock (RTC) 


For systems that require accurate timekeeping, the RTC peripheral is indispensable. In this chapter, 
you'll learn how to set up and use the RTC to track time in low-power systems, even when the 
microcontroller is in sleep mode. 


Chapter 16, Independent Watchdog (IWDG) 


Stability is crucial for embedded systems, and the Independent Watchdog Timer (IWDG) ensures that 
your system can recover from unexpected malfunctions. This chapter teaches you how to configure the 
IWDG to automatically reset your microcontroller if it stops responding, ensuring reliable operation. 


Chapter 17, Direct Memory Access (DMA) 


Direct Memory Access (DMA) allows data transfers to occur independently of the CPU, significantly 
improving system efficiency. This chapter covers how to configure and use DMA for memory-to- 
memory transfers, as well as for peripherals like ADC and UART, offloading the work from the CPU. 


Chapter 18, Power Management and Energy Efficiency in Embedded Systems 


Power management is essential for energy-efficient systems, especially in battery-powered devices. In 
this final chapter, you'll learn techniques for reducing power consumption, including how to use sleep 
modes, wake-up sources, and optimize firmware to achieve the best balance between performance 
and energy efficiency. 


To get the most out of this book 


To fully benefit from this book, it’s important to have a general familiarity with the C programming 
language. While we'll cover the specifics of embedded systems programming in detail, having a 
basic understanding of how code operates will make the material easier to follow. Familiarity with 
microcontrollers is certainly helpful but not a strict requirement. Everything you need will be introduced 
as we progress. Whether you're a beginner or an experienced developer, this book will guide you step- 
by-step through the fascinating world of bare-metal embedded programming. 


Software/hardware covered in the book | Operating system requirements 


STM32CubeIDE Windows 


GNU Arm Embedded Toolchain 


NUCLEO-411 Development Board 


10k Potentiometer 
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Software/hardware covered in the book | Operating system requirements 


OpenOCD 


Notepad++ 


RealTerm 


If you are using the digital version of this book, we advise you to type the code yourself or access 
the code from the book’s GitHub repository (a link is available in the next section). Doing so will 
help you avoid any potential errors related to the copying and pasting of code. 


Download the example code files 


You can download the example code files for this book from GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal-Embedded-C-Programming. If theres an update to the 
code, it will be updated in the GitHub repository. 


We also have other code bundles from our rich catalog of books and videos available at https: // 
github.com/PacktPublishing/. Check them out! 


Conventions used 


There are a number of text conventions used throughout this book. 


Code in text: Indicates code words in text, database table names, folder names, filenames, file 
extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Copy 
the path to the openocd bin folder” 


A block of code is set as follows: 


// 22: Set PAS5(LED PIN) high 
GPIOA OD R |= LED PIN; 


When we wish to draw your attention to a particular part of a code block, the relevant lines or items 
are set in bold: 


// 1: Define base address for peripherals 
#define PERIPH BASE (0x40000000UL) 
// 2: Offset for AHB1 peripheral bus 
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Any command-line input or output is written as follows: 


monitor flash write image erase 4 makefile project.elf 


Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words 
in menus or dialog boxes appear in bold. Here is an example: “Right-click on This PC, and then 
choose Properties.” 


Tips or important notes 
Appear like this. 


Get in touch 


Feedback from our readers is always welcome. 


General feedback: If you have questions about any aspect of this book, email us at customercare@ 
packtpub. comand mention the book title in the subject of your message. 


Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. 
If you have found a mistake in this book, we would be grateful if you would report this to us. Please 
visit www. packtpub.com/support/errata and fill in the form. 


Piracy: If you come across any illegal copies of our works in any form on the internet, we would 
be grateful if you would provide us with the location address or website name. Please contact us at 
copyright@packt . com with a link to the material. 


If you are interested in becoming an author: If there is a topic that you have expertise in and you 
are interested in either writing or contributing to a book, please visit authors .packtpub.com. 
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Share Your Thoughts 


Once you've read Bare-Metal Embedded C Programming, wed love to hear your thoughts! Please click 
here to go straight to the Amazon review page for this book and share your feedback. 


Your review is important to us and the tech community and will help us make sure we're delivering 


excellent quality content. 
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Preface 


Download a free PDF copy of this book 


Thanks for purchasing this book! 

Do you like to read on the go but are unable to carry your print books everywhere? 

Is your eBook purchase not compatible with the device of your choice? 

Don't worry, now with every Packt book you get a DRM-free PDF version of that book at no cost. 


Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical 
books directly into your application. 


The perks don’t stop there, you can get exclusive access to discounts, newsletters, and great free content 
in your inbox daily 


Follow these simple steps to get the benefits: 


1. Scan the QR code or visit the link below 


https://packt.link/free-ebook/9781835460818 


2. Submit your proof of purchase 


3. That’s it! We'll send your free PDF and other benefits to your email directly 
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Setting Up the Tools 
of the Trade 


In the world of embedded systems, crafting efficient firmware begins with a clear comprehension of the 
tools available. This chapter will guide you in establishing a robust development environment, ensuring 
that you are equipped with all the necessary tools for a comprehensive firmware development experience. 


Central to our discussion is the concept of datasheets. Consider these as the detailed blueprints for 
any microcontroller, encompassing its capabilities, specifications, and intricate details. However, the 
challenge often isn’t merely understanding a datasheet but also sourcing the correct datasheets tailored 
to your specific microcontroller. To address this, I will assist you in pinpointing and understanding 
both datasheets and user manuals important to our chosen microcontroller. 


As we progress, we'll delve into the intricacies of setting up our Integrated Development Environment 
(IDE) and acknowledging its critical function within the development life cycle. Furthermore, you'll 
gain insights into configuring the GNU Arm Embedded Toolchain and OpenOCD. These tools will 
later empower us to craft our firmware, bypassing the need for an IDE altogether. 


In this chapter, we will explore the following main topics: 


e Essential development tools for microcontrollers 
e The development board 
e Datasheets and manuals - unraveling the details 


e Navigating the STM32CubeIDE 
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Setting Up the Tools of the Trade 


Technical requirements 


The following are the prerequisites for the chapter: 


STM32CubeIDE: https: //www.st.com/en/development -tools/stm32cubeide. 
html 


GNU Arm Embedded Toolchain (gcc-arm-none-eabi-10.3-2021.10-win32. 
exe): https: //developer.arm.com/downloads/-/gnu-rm 


OpenOCD: https: //github.com/xpack-dev-tools/openocd-xpack/releases 
Notepad++: https: //notepad-plus-plus.org/downloads/v8.5.8/ 


STM32F11 reference manual: https: //www.st.com/resource/en/reference _ 
manual/rm0383-stm32f£411xce-advanced-armbased-32bit-mcus- 
stmicroelectronics.pdf 


STM32F411 datasheet: https: //www.st.com/resource/en/reference _ 
manual/rm0383-stm32f£411xce-advanced-armbased-32bit-mcus- 
stmicroelectronics.pdf 


NUCLEO-F411 user manual: https: //www.st.com/resource/en/user_manual/ 
um1724-stm32-nucleo64-boards-mb1136-stmicroelectronics.pdf 


Cortex-Mé4 generic user guide: https: //developer.arm.com/documentation/ 
duid0553/latest/ 


Essential development tools for microcontrollers 


In this section, we will explore the essential tools that form the backbone of our development process. 
Understanding these tools is important, as they will be our companions in transforming ideas into 
functioning firmware. 


When selecting tools for firmware development, we have two primary options. 


IDEs: An IDE is a unified software application offering a Graphical User Interface (GUI) 
tailored to crafting software — in our context, firmware. Popular IDEs for microcontroller 
firmware development include the following: 


Keil uVision (also known as Keil MDK): Developed by ARM Holdings 
STM32CubeIDE: Developed by STMicroelectronics 
IAR Embedded Workbench: Developed by IAR Systems 


These IDEs boast a GUI-centric design, enabling users to conveniently create new files, build, 
compile, and step through code lines interactively. For the demonstrations and exercises in this 
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book, we'll use the STM32CubelIDE. It has all the requisite features and is generously available 
for free, without any code size constraints. 


e Toolchains: At its core, a toolchain is a cohesive set of development tools, sequenced in distinct 
stages, to produce the final firmware build for the target microcontroller. This approach bypasses 
the comfort of a GUI. Instead, firmware is written using basic text editors such as Notepad or 
Notepad++, with the command line used to execute the various phases of the build process 
-commands such as assemble, compile, and link are often used. In this book, we'll use 
the open source GNU Arm Embedded Toolchain. Based on the renowned open source GNU 
Compiler Collection (GCC), this integrates a GCC compiler tailored for ARM, the GNU 
Debugger (GDB) debugger, and several other invaluable utilities. 


In the following section, we will carefully go through the process of setting up our preferred IDE, 
the STM32CubeIDE. 


Setting up the STM32CubelDE 


Throughout this book, we'll use both the STM32CubeIDE and the GNU Arm Embedded Toolchain 
to develop our firmware. Leveraging an IDE such as STM32CubeIDE enables us to easily analyze and 
compare the linker script and startup files, autogenerated by the IDE, against those we'll construct 
from the ground up. 


Let's start by downloading and installing STM32CubeIDE: 


1. Launch your web browser and navigate to st . com. 


2. Click on STM32 Developer Zone, and then select STM32CubeIDE. 


sample & buy V Q Support & community V Baie FAST 


Tools & Applications Solutions Ei 


software 


Figure 1.1: The home page of st.com 


3. Scroll down to the All software versions section of the page and click on Download Software. 
You'll need to log into your ST account before proceeding with the download. 
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All software versions 


Software Name Operating System Release Version 


STM32CubelDE Windows v 1.14.0 v Download Software © 


Figure 1.2: The All software versions section of the stm32cubeide page 


4. Ifyou don’t have an account, click on Login/Register to sign up. If you already have one, 
simply log in. 


5. Complete the registration form with your first name, last name, and email address. 
6. Click on Download to start the download process. A . zip file will be downloaded into your 
Downloads folder. 


Let’s install the STM32CubeIDE: 


1. Unzip the downloaded package. 

2. Double-click the st -stm32cubeide file to initiate the installer. 

3. Retain default settings by clicking Next throughout the setup process. 
4 


On the Choose Components page, ensure that both SEGGER J-Link drivers and ST-LINK 
drivers are selected. Then, click Install. 


OE} STMicroelectronics STM32CubelDE — x 
Choose Components 
Choose which features of STMicroelectronics STM32CubeIDE you 
want to install. 


Check the components you want to install and uncheck the components you don't want to 
install. Click Install to start the installation. 


: E (Desciption 
Select components to install: T 
[M] ST-LINK drivers ; 


Space required: 2.9 GB 


<i) [ea 


Figure 1.3: The installer showing the Choose Components page 
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Having successfully installed STM32CubeIDE on our computer, we will now proceed to configure 
our alternate development tool, the GNU Arm Embedded Toolchain. 


Setting up the GNU Arm Embedded Toolchain 


In this section, we will go through the process of setting up the GNU Arm Embedded Toolchain - an 
important tool for developing firmware for ARM-based microcontrollers: 


1. Launch your web browser and navigate to https: //developer.arm.com/downloads/- / 
gnu-rm. 


2. Scroll down the page to find the download link appropriate for your operating system. For 
those of you using Windows, like myself, opt for the . exe version. For Linux or macOS users, 
choose the corresponding . tar file for your operating system. 


3. After the download completes, double-click the installer to begin the installation process. 


4. Read through the license agreement. Then, choose to install in the default folder location by 


clicking Install. 
( GNU Arm Embedded Toolchain 10.3-2021.10 = XxX 
Choose Install Location aN 
Choose the folder in which to install GNU Arm Embedded Toolchain 10.3-2021. 10. © 


Setup will install GNU Arm Embedded Toolchain 10.3-2021. 10 in the following folder. To install 
in a different folder, click Browse and select another folder. Click Install to start the 
installation. 


Destination Folder 


Program Files (x86)\GNU Arm Embedded Toolchain\10 2021. 10 Browse... 


Space required: 699.4MB 
Space available: 148.5GB 


Nullsoft Install System v2,50-1 


aa a [cre 


Figure 1.4: The GNU Arm Embedded Toolchain installer 
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5. When the installation is complete, ensure that you check the Add path to environment 
variable option. 


(@ GNU Arm Embedded Toolchain 10.3-2021.10 


Completing the GNU Arm Embedded 
Toolchain 10.3-2021.10 Setup 
Wizard 


GNU Arm Embedded Toolchain 10.3-2021. 10 has been 
installed on your computer. 


Click Finish to dose this wizard. 


Figure 1.5: The installer showing the Add path to environment variable option 


6. Click Finish to finalize the setup. 


Setting up OpenOCD 


For firmware development with the GNU Arm Toolchain, OpenOCD plays an integral role, facilitating 
both the downloading of firmware into our microcontroller and the debugging of code in real time. 


Let’s set up OpenOCD: 


1. Launch your web browser and navigate to https: //openocd.org/pages/getting- 
openocd.html. 
2. Scroll to the section titled Unofficial binary packages. 


3. Click on the link for multiplatform binaries. 
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Unofficial binary packages 


Some special circumstances might make using a package manager or self-compiling OpenOCD impractical, so several 


nice community members provide regularly updated binary builds on their web-sites. 


e Liviu lonescu maintains multi-platform binaries (Windows 32/64-bit, Intel GNU/Linux 32/64-bit, Arm GNU/Linux 


32/64-bit, and Intel macOS 64-bit) as part of The xPack OpenOCD project. 


Figure 1.6: The Unofficial binary packages section 


4. Identify the latest version compatible with your operating system. For an exhaustive list, click 


on Show all 14 assets. 


v Assets 4 


@xpack-openocd-0.12.0-2-darwin-arm64.tar.gz 


@xpack-openocd-0.12.0-2-darwin-arm64. tar.gz.sha 


@xpack-openocd-0.12.0-2-darwin-x64.tar.gz 
@xpack-openocd-0.12.0-2-darwin-x64.tar.gz.sha 
@xpack-openocd-0.12.0-2-linux-arm.tar.gz 
@xpack-openocd-0.12.0-2-linux-arm.tar.gz.sha 
@xpack-openocd-0.12.0-2-linux-arm64.tar.gz 
@xpack-openocd-0.12.0-2-linux-arm64.tar.gz.sha 
@xpack-openocd-0.12.0-2-linux-x64.tar.gz 
@xpack-openocd-0.12.0-2-linux-x64.tar.gz.sha 
Source code (zip) 

4) Source code (tar.gz) 


OW a 4 assets 


Figure 1.7: The OpenOCD packages 


5. For Windows users, download the win32-x64 . zip version. For Linux or macOS users, 


2.11 MB 
109 Bytes 
2.18 MB 
107 Bytes 
2.34 MB 
106 Bytes 
2.41 MB 
108 Bytes 


2.46 MB 


download the corresponding . tar file for your operating system. 


6. Once downloaded, unzip the package. 


7. Navigate to the extracted folder, and then the bin subfolder. Here, you'll find the openocd. 
exe application. This is the application we shall call in the command prompt together with 
the specific script of our chosen microcontroller, in order to debug or download code onto 


the microcontroller. 
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Within the xpack-openocd-0.12.0-2|openocd| scripts directory structure, you'll find 
scripts tailored for various microcontrollers and development boards. 


Next, we need to add OpenOCD to our environment variables: 


1. Begin by relocating the entire openocd folder to your Program Files directory. 


BO! @ B=! bin - oa x 
Home Share View Q 
Pi B $ Cut x y i Th New item ~ R: Open- Pfseiet alt 
x 


Copy path # | Easy access © a Edit Select none 
Pin to Quick Copy Paste Move ‘opy Delete Rename New Properties 
access [Ê] Paste shortcut to te - tolder - History gP Invert selection 
Clipboard Organize New Open Select 


€ ~ Af > ThisPC > Local Disk (C:) > Program Files > xpack-openocd-0.12.0-2 > bin 


Search bin 


Æ Quick access 


Creative Cloud Files d 


@ OneDrive - Personal 


libftdil.dll libusb-1.0.dil openocd.exe 
E This PC 


P 30 Objects 
E Desktop 

£] Documents 
} Downloads 
d Music 

© Pictures 

H Videos 

i Local Disk (C:) 
wa Local Disk (E:) 


È Network 
3 items = = 


Figure 1.8: OpenOCD moved to Program Files, showing the path to the bin folder 
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2. Copy the path to the openocd bin folder. 
3. Right-click on This PC, and then choose Properties. 


4. Search for and select Edit the system environment variables. 


Settings 
f Home Abo ut 

Environment Variables} x Your PC is monitored and protected. 
| Edit the system environment variables See details in Windows Security 


Edit environment variables for your 


account Device specifications 
© Power & sleep Device name DESKTOP-KQTO99K 
Processor Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz 2.40 
© Battery GHz 
Installed RAM 16.0 GB (15.9 GB usable) 
œ Storage Device ID 5315E5B3-9B22-4CBC-9631-154737AEF39B 
Product ID 00330-81616-29948-AA192 
C4 Tablet System type 64-bit operating system, x64-based processor 
Pen and touch No pen or touch input is available for this display 
5i Multitasking 


Copy 


Figure 1.9: The This PC properties page 
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5. Click the Environment Variables button in the System Properties pop-up window. 


System Properties 


Computer Name Hardware Advanced System Protection Remote 


You must be logged on as an Administrator to make most of these changes. 
Performance 
Visual effects, processor scheduling, memory usage, and virtual memory 


User Profiles 
Desktop settings related to your signin 


Startup and Recovery 
System startup, system failure, and debugging information 


Figure 1.10: The System Properties pop-up window 


6. Under the User variables section of the Environment Variables popup, double-click the 
Path entry. 
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7: 
8. 


Environment Variables 


User variables for Ninsaw 


Variable 


OneDrive 
OneDriveConsumer 


Value 


C:\Users\Ninsaw\ OneDrive 
C:\Users\Ninsaw\OneDrive 


C:\Program Files (x86)\GNU Arm Embedded Toolchain\10 2021.10\... 


PyCharm Community Edition C:\Program Files\VetBrains\PyCharm Community Edition 2022.2.3\b... 
TEMP C:\Users\Ninsaw\AppData\Local\ Temp 
TMP C:\Users\Ninsaw\AppData\Local\ Temp 

New. || Edit.. Delete 

System variables 

Variable Value 
AZURE_COSMOS_DB_ACCO... pM3K4lyjYYWoZPOBDXWITWI0jO9GFSDQ8rSFPVo3jRvwz4ZnyHJuLK... 
ComSpec C:\Windows\system32\cmd.exe 
DriverData C:\Windows\System32\Drivers\DriverData 
NUMBER_OF_PROCESSORS 8 
os Windows_NT 
Path C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;... 
PATHEXT -COM:.EXE:.BAT:.CMD:.VBS:.VBE:.JS: JSE:. WSF:.WSH:.MSC 


New.. || Edit.. | Delete 


Figure 1.11: The Environment Variables popup 


In the Edit environment variable popup, click on New to create a row for a new path entry. 


Paste the previously copied OpenOCD path into this new row. 
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9. Confirm your changes by clicking OK on the various pop-up windows. 


Edit environment variable x 
C:\Program Files (x86)\GNU Arm Embedded Toolchain\10 2021.10\bin | New 
C:\Program Files\MySQL\MySQL Shell 8.0\bin\ 
C:\Users\Ninsaw\Documents\Python37\Scripts\ Edit 
C:\Users\Ninsaw\Documents\Python37\ 
YUSERPROFILE%\AppData\Local\Microsoft\WindowsApps | Browse... 
%PyCharm Community Edition% 

C:\Users\Ninsaw\AppData\Local\Programs\Microsoft VS Code\bin Delete 


C:\Program Files\Azure Data Studio\bin 
C:\Users\Ninsaw\AppData\Roaming\npm 
C:\Program Files\xpack-openocd-0.12.0-2\bin Move Up 


Move Down 


Edit text... 


Figure 1.12: The Edit environment variable popup 


Finally, we have successfully completed the setup process. We have configured two essential, standalone 
tools to develop firmware for STM32 microcontrollers - the STM32CubeIDE for an IDE, and the 
GNU Arm Embedded Toolchain, complemented by OpenOCD, to develop and debug our firmware 
without an IDE. 


Next, we will turn our attention to our development board. 


The development board 


The development board 


In this segment of the chapter, we will delve into the specifications and features of the development 
board selected for this book. 


Understanding the role of a development board 


Firstly, let’s clarify the concept of a development board. It’s essential to note that a development 
board is not synonymous with a microcontroller. While a development board might derive part 
of its name from the microcontroller mounted on it, it would be a misnomer to refer to the board 
itself as the microcontroller. A development board allows us to validate our firmware on the exact 
microcontroller variant we aim to deploy in our final product. Consequently, the firmware tested on 
our development board is assured to operate identically on the microcontroller in the end product. 
This is why companies such as STMicroelectronics offer a diverse range of development boards, 
tailored to each microcontroller in their portfolio. 


It’s also essential to contrast the role of a development board with prototyping boards, such as 
Arduino. While prototyping boards (which might not house the microcontrollers intended for the 
final product) serve as preliminary testing platforms, development boards elevate this process. They 
enable us to rigorously test concepts and the performance evaluation of our firmware on the designated 
microcontroller meant for bulk product manufacturing. For the purposes of this book, our focus will 
be on the NUCLEO-F411 development board. 


An overview of the NUCLEO-F411 Development Board 


The NUCLEO-F411 development board is equipped with an STM32F411RE microcontroller, capable 
of a peak operating frequency of 100MHz. It comes with a generous 512 Kbytes of flash memory 
and 128 Kbytes of SRAM. Furthermore, the board is equipped with several columns of berg pins, 
allowing us to effortlessly make connections using jumper wires to interface with a variety of modules 
and components - from sensors and motors to LEDs. For quick and straightforward input/output 
firmware tests, the board also features a built-in user button and LED, eliminating the immediate 
need for external components. 
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USB ST-LINK 


RESET 


USER BUTTON 


BUTTON 


USER 

LED 
BERG 
PINS 


STM32F411RE MICROCONTROLLER 


Figure 1.13: The NUCLEO-F411 development board 


Now that we're familiar with the development board, let’s delve into the various types of documentation 
that are essential for a comprehensive understanding and programming of the development board. 


Datasheets and manuals - unraveling the details 


Our main objective in this book is to write firmware code that interacts directly with the registers 
of our microcontroller. This means there’s no abstraction or intermediary library between our code 
and the target microcontroller. To achieve this, it’s important to grasp the internal architecture of the 
microcontroller, understand the addresses of each register we interact with, and know the functions 
of relevant bits within those registers. This is where datasheets and manuals come in. Manufacturers 
provide these documents for users to understand their products, which in our case refers to the 
microcontroller core architecture, the microcontroller, and the development board. 


Two distinct companies play roles in the making of our development board. The first is ARM Holdings, 
which licenses processor and microcontroller core architecture designs to semiconductor manufacturing 
firms such as STMicroelectronics, Texas Instruments, and Renesas. These manufacturers then produce 
the physical microcontroller or processor based on the licensed designs from ARM, often with their 


Datasheets and manuals — unraveling the details 


custom additions. This explains why two different microcontrollers from separate manufacturers might 
share the same microcontroller core. For instance, both the TM4C123 from Texas Instruments and 
STM32F4 from STMicroelectronics are based on the ARM Cortex-M4 core. 


Since our chosen development board, the NUCLEO-F411 from STMicroelectronics, is based on the 
ARM Cortex-M4 microcontroller core, in the following sections, we'll delve into the documentation 
for the board, its integrated microcontroller, and the underlying core. 


Understanding STMicroelectronics’ documentation 


A significant reason for the popularity of STM32 microcontrollers is STMicroelectronics’ continued 
commitment to providing comprehensive support. This includes well-organized documentation and 
various firmware development resources. 


STMicroelectronics has a range of documents, each following a specific naming convention. Let’s 
discuss those relevant to our work: 


e Reference Manual (RM): All RMs start with the letters RM, followed by a number. For instance, 
the RM for our microcontroller is RM03 83. This document details every register in our 
microcontroller, clarifying each bit’s role and providing insights on register configurations. 


e Datasheet: The datasheet is named after the microcontroller, so for the STM32F411 microcontroller, 
the datasheet is simply called STM32F411. This document provides a functional overview of 
the microcontroller, a complete memory map, a block diagram showcasing the microcontroller’s 
peripherals and connecting buses, as well as the pinout and electrical characteristics of 
the microcontroller. 


e UM (User Manual): Starting with the letters UM and followed by a number, such as UM1724 
for our NUCLEO-F411, this document focuses on the development board. It describes how 
components on our board, such as LEDs and buttons, are connected to specific ports and pins 
of the microcontroller. 


The generic user guide by ARM 


ARM provides documents for every microcontroller and processor core they design. Important to 
our discussion is the generic user guide for our microcontroller core. As were using the STM32F411, 
which is based on the ARM Cortex-M4 core, we'll refer to the Cortex-M4 generic user guide. 


This means that if we were using an STM32F7 microcontroller, which is based on the ARM Cortex-M7 
core, then we would need to get the Cortex-M7 generic user guide. The naming convention of this 
document is simply the name of the microcontroller core + the phrase generic user guide. 


As the name implies, this document provides information generic to the specific microcontroller 
core. This means that the information provided in the Cortex-M4 generic user guide applies to 
all microcontrollers based on the Cortex-M4 core, irrespective of the manufacturers of those 
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microcontrollers. In contrast, the information provided in the STMicroelectronics documentation 
applies to only STMicroelectronics’ microcontrollers. 


NUCLEO-F4 Development Board 


STM32F411 Microcontroller 


Figure 1.14: The relationship between the development board, microcontroller, and microcontroller core 


Why do we need the generic user guide? 


The generic user guide provides information on the core peripherals of the processor core. As the term 
suggests, these core peripherals are consistent across all microcontrollers, based on a specific core. The 
Cortex-M4 core has five core peripherals - the System Timer, Floating-Point Unit, System Control Block, 
Memory Protection Unit, and the Nested Vectored Interrupt Controller. When developing bare-metal 
drivers for these peripherals, the generic user guide is the definitive source for the essential details. 


Additionally, the guide provides information on the microcontroller core’s Instruction Set, as well as 


the Programmer’s Model, Exception Model, fault handling, and power management. 


Getting the documents 
To obtain the aforementioned documents, you can use the following search phrases on Google: 


e RM: Either STM32F11 Reference Manual or RM0383. 
e Datasheet: STM32F411 Datasheet. 
e UM:Nucleo-F11 User Manual or UM1724. 


e Generic user guide: Cortex-M4 Generic User Guide 


Navigating the STM32CubelDE 


Direct links to these documents are also available, in the Technical requirements section of this chapter. 


Before analyzing the key areas of the various documents to program our development board, let's 
first take a closer look at the STM32CubeIDE we installed earlier. We will familiarize ourselves with 
its features and functionalities in the next section. 


Navigating the STM32CubelDE 


When you launch STM32CubeIDE for the first time, you'll see the Information Center. This center 
offers quick access to a number of valuable resources for STM32 firmware development. 


To exit the Information Center, simply click the X on its tab. If you wish to revisit it later, simply 
navigate to Help | Information Center. 


Edit Source Refactor Navigate Search Project Run Window 


E Information Center X 


@® stwz2cubelDE Home 


Start a project 


o 


Start new 
STM32 
project 


m 


Start new 
project from 
STM32CubeMX 
file (.i0c) 


Figure 1.15: Information Center 


The STM32CubelIDE is based on the Eclipse framework, and therefore, the layout and elements are 
similar to those of other Eclipse-based IDEs. 
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Let’s go through the process of creating a new project: 


1. Either click Create a New STM32 project in the empty Project Explorer pane or select File 
| New | STM32 Project. 


IDE} tempWorkspace - STM32CubelDE 
File Edit Source Refactor Navigate Search Project Run Window Help 
ne Dr RrIP Oe O Br B- G- Grit: 


R Project Explorer X Sie & & fl 


0 


There are no projects in your workspace. 

To add a project: 
Create a new Makefile project in a directory containing existing code 
[| Create a new C or C++ project 
fioe] Create a New STM32 project 


fmx] Create a New STM32 Project from an Existing STM32CubeMX 
Configuration File (.ioc) 


PY Create a project... 
ùg Import projects... 


Figure 1.16: A workspace showing an empty Project Explorer pane 


2. You will be presented with the Target Selection window to select the microcontroller or 
development board for your project. 


3. Click the Board Selector tab. 
4. Enter NUCLEO-F411 into the Commercial Part Number field. 


Navigating the STM32CubelDE 


G STM32 Project 


Target Selection 


Â STM32 target or STM32Cube example selection is required 


Exampl 


Board Filters 


Commercial NUCLEO-F411 

Part Number 

Q += 
PRODUCT INFO 

Type 

Supplier 

MCU / MPU Series 


Marketing Status 


Price 


MEMORY 


Ext. Flash = 0 (MBit) 


Figure 1.17: The Target Selection window 


5. From the displayed board list, select NUCLEO-F11RE, and then click Next. 
0 a e T 


* NUCLEO-F411RE 
Ext. RAM = 0 (MBit) 
© 


FEATURES 
Embedded Sensor 
User Button 


Camera 


Figure 1.18: The board list with NUCLEO-F411 selected 
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6. Give the project a name. 


7. For Targeted Project Type, select Empty. 


(55 STM22 Project 


Setup STM32 project 


Project 


EA 


M Use default location 
C:/Users/Ninsaw/Documents/tempWorkspace Browse... 
Options 
Targeted Language 
© G O C++ 
Targeted Binary Type 
@ Executable © Static Library 


Targeted Project Type 
OSTM32Cube @ Empty 


Figure 1.19: The Setup STM32 project window 


8. Click Finish to create the project. 


You will see the new project, containing all the necessary startup files and linker scripts, in the Project 


Explorer pane. 


Understanding the control icons 


[55 tempWorkspace - STM32CubelDE 
File Edit Source Refactor Navigate Search Project Run Window Help 
Ovi] B-~}~ mi a@~ By G- Gri t-O-@-iwibiaev- 
© Project Explorer X 
v [DS FirstProject 

fy) Includes 

@ Inc 

v Æ Src 


[À main.c 


[A syscalls.c 
[À sysmem.c 
@ Startup 


Figure 1.20: The Project Explorer pane showing a new project 


Understanding the control icons 


The most frequently used control icons are the New, Build, and Debug icons. 


File Edit Source Refactor i Search Project Run Window Help 
eRe "8 B-G-[R}O-a- 
New Build Debug 


Figure 1.21: The control icons 


Let’s look closely at the functions of these icons: 


e The New icon: This icon allows us to create various files, including source code, header files, 
projects, libraries, and more. This function is also accessible via File | New. 


e The Build icon: Used for building projects. This functionality can also be accessed by right- 
clicking on a project and selecting Build Project. 


e The Debug icon: This launches a debug configuration to facilitate project debugging. This 
functionality is also available by right-clicking a project and selecting Debug Project. 


These control icons provide quick access to essential functions, significantly enhancing productivity 
and streamlining the development process. 
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Summary 


In this chapter, we set out to create a robust environment for embedded firmware development, 
focusing on the careful selection and setup of essential tools. Each tool we selected plays a crucial role 
in the efficient development of firmware for microcontrollers. We explored the installation processes 
of STM32CubeIDE, the GNU Arm Embedded Toolchain, and OpenOCD, laying a solid groundwork 
for our development activities. 


We then introduced the NUCLEO-F411 development board, equipped with an STM32F411RE 
microcontroller, as our experimental platform, and we spent time identifying some of the components 
on the board. 


We also emphasized the importance of knowing the different types of datasheets and reference 
manuals, equipping us with the ability to quickly access detailed information about a microcontroller’s 
architecture and functionalities. 


Moving on to the next chapter, we will leap into developing our first bare-metal firmware, using only 
the documentation we have compiled as our guide. 
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Constructing Peripheral 
Registers from 
Memory Addresses 


Bare-metal programming is all about working directly with the registers in the microcontroller 
without going through a library, allowing us to gain a deeper understanding of the microcontroller’s 
capabilities and limitations. This approach enables us to optimize our firmware for speed and efficiency, 
which are two very important parameters in embedded systems where resources are often limited. 


In this chapter, our journey begins with an exploration of various firmware development methodologies, 
highlighting the different levels of abstraction each offers. We then proceed to learn how to identify 
the ports and pins associated with key components on our development board. This step is crucial 
for establishing a proper interface with the microcontroller’s peripherals. 


Next, we delve into defining the addresses of some peripherals using the microcontroller’s official 
documentation. This will allow us to create the addresses of the various registers in those peripherals. 


In the latter sections of the chapter, our focus shifts to practical application. We will use the register 
addresses that we've created to configure PA5 to activate the user light-emitting diode (LED) of the 
development board. 


In this chapter, were going to cover the following main topics: 
e The different types of firmware development 
e Locating and understanding the development board’s components 
e Defining and creating registers through documentation insights 


e Register manipulation - from configuration to running your first firmware 
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By the end of this chapter, you will have a solid foundation in both navigating and programming STM32 
microcontrollers at the register level. You will be equipped to write your initial bare-metal firmware, 
relying solely on information gathered from the documentation and the integrated development 
environment (IDE). 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


The different types of firmware development 


There are several ways to develop the firmware for a particular microcontroller depending on the 
resources provided by the microcontroller’s manufacturer. When it comes to developing firmware for 
STM32 microcontrollers from STMicroelectronics, we can use the following: 


e Hardware Abstraction Layer (HAL): This is a library provided by STMicroelectronics. It simplifies 
the process by offering high-level APIs for configuring every aspect of the microcontroller. 
What is great about HAL is its portability. We can write code for one STM32 microcontroller, 
and easily adapt it for another, thanks to the uniformity of their APIs. 


e Low Layer (LL): Also from STMicroelectronics, the LL library is a leaner alternative to HAL, 
offering a faster, more expert-oriented approach that’s closer to the hardware. 


e Bare-Metal C programming: With this approach, we dive right into the hardware, accessing 
the microcontroller’s registers directly using the C language. It’s more involved but offers a 
deeper understanding of the microcontroller’s workings. 


e Assembly language: This is similar to Bare-Metal C, but instead of C, we use assembly language 
to interact directly with the microcontroller’s registers. 


Let’s compare the four firmware development methods weve discussed: HAL, LL, Bare-Metal C, 
and assembly language. Each method has its unique style and level of abstraction, impacting how 
we interact with the microcontroller’s hardware. We will use the example of configuring a general- 
purpose input/output (GPIO) pin as an output to illustrate these differences. 


HAL 


The following code snippet demonstrates how to initialize GPIOA pin 5 as an output using HAL: 


#include "stm32f4xx hal.h" 
GPIO InitTypeDef GPIO InitStruct = {0}; 
// Enable the GPIOA Clock 

HAL RCC _GPIOA CLK ENABLE () ; 


The different types of firmware development 


// Configure the GPIO pin 

GPIO InitStruct.Pin = GPIO PIN 5; 
GPIO_InitStruct .Mode = GPIO MODE _OUTPUT_PP; 
HAL GPIO Init(GPIOA, &GPIO InitStruct) ; 


Let’s analyze the snippet: 
e #include "stm32f4xx_hal.h": This line includes the HAL library specific to the 
STM32F4 series, providing access to the HAL functions and data structures 


e GPIO InitTypeDef GPIO InitStruct = {0}: Here, we declare and initialize 
an instance of the GPIO_InitTypeDef structure, which is used to configure the GPIO 
pin properties 


° HAL RCC _ GPIOA CLK ENABLE (): This macro call enables the clock for GPIO port A, 
ensuring that the GPIO peripheral is powered and can function 


e GPIO InitStruct.Pin = GPIO PIN _5: This line sets the pin to be configured, in 
this case, pin 5 of port A 


e GPIO InitStruct.Mode = GPIO MODE OUTPUT PP: Over here, we configure the 
pin as an output pin 


e HAL GPIO Init (GPIOA, &GPIO_InitStruct) : Finally, we initialize the GPIO pin 
(PAS) with the configuration settings specified in GPIO_InitStruct 


This snippet shows the ease and readability of the HAL approach in performing common hardware 
interfacing tasks. We can summarize the benefits and drawbacks of the HAL approach as follows: 


e Level of abstraction: High 

e Ease of use: Easier for beginners due to its high-level abstraction 

e Code verbosity: More verbose, with several lines of code required for simple tasks 
e Portability: Excellent across different STM32 devices 


e Performance: Slightly slower due to additional abstraction layers 


LL 


This is how we initialize GPIOA pin 5 as an output pin using the LL library: 


#include "stm32f4xx_11_bus.h" 

#include "stm32f4xx_11_gpio.h" 

// Enable the GPIOA Clock 

LL AHB1 GRP1 EnableClock (LL _AHB1 GRP1 PERIPH GPIOA) ; 
// Configure the GPIO pin 
LL GPIO SetPinMode (GPIOA,LL GPIO PIN _5,LL GPIO MODE OUTPUT) ; 
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Let’s break it down, line by line: 


#include "stm32f4xx_11_bus.h"and#include "stm32f4xx 11_ 
gpio.h" include the necessary LL library files for handling the bus system and GPIO 
functionality, respectively 


LL_AHB1_GRP1_EnableClock(LL_AHB1 GRP1_PERIPH_GPIOA) : This function 
call enables the clock for GPIO port A 


LL GPIO SetPinMode(GPIOA, LL GPIO PIN 5, LL GPIO MODE OUTPUT) : 
Finally, we set the mode of GPIOA pin 5 to output mode 


The LL library provides a more direct and lower-level approach to hardware interaction compared to 
HAL. This is often preferred in scenarios where finer control over hardware and performance is required. 


The benefits and drawbacks of the LL approach are as follows: 


Level of abstraction: Medium 
Ease of use: Moderate, with a balance between abstraction and direct control 


Code verbosity: Less verbose than HAL, offering a more straightforward approach to 
hardware interaction 


Portability: Good, but slightly less than HAL 


Performance: Faster than HAL, as it’s closer to the hardware 


Bare-Metal C 


Let’s see how to accomplish the same task using the Bare-Metal C approach: 


#define GPIOA MODER (*(volatile unsigned long *) (GPIOA BASE + 0x00) ) 
#define RCC _AHB1ENR (* (volatile unsigned long *) (RCC_BASE + 0x30) ) 


// Enable clock for GPIOA 
RCC_AHB1ENR |= (1 << 0); 


// Set PA5 to output mode 
GPIOA MODER le (1 << 10); // Set bit 10 (MODERS5 [1] ) 


The different types of firmware development 


Let’s break down the bare-metal c snippet: 


#define GPIOA MODER (*(volatile unsigned long *) (GPIOA BASE 
+ 0x00)) and#define RCC_AHB1ENR (* (volatile unsigned long *) 
(RCC_BASE + 0x30) ) define pointers to specific registers within the microcontroller’s 
memory. GPIOA_MODER points to the GPIO port A mode register, and RCC_AHB1ENR points 
to the reset and clock control (RCC) AHB1 peripheral clock enable register (ER). The use 
of the volatile keyword ensures that the compiler treats these as memory-mapped registers, 
preventing optimization-related issues. 


RCC_AHB1ENR |= (1 << 0): This line of code enables the clock for GPIOA. It does this 
by setting the first bit (bit 0) of the RCC_AHB1ENR register. The bitwise OR assignment (| =) 
ensures that only the specified bit is changed without altering other bits in the register. 


GPIOA_MODER |= (1 << 10): This line sets PA5 to output mode. In the GPIO port mode 
register (GPIOA_MODER), each pin is controlled by two bits. For PA5, these are bits 10 and 
11 (MODERS5 [1 : 0] ). The code sets bit 10 to 1 (and leaves bit 11 as 0, assuming it was already 
0), configuring PAS as a general-purpose output mode. 


With this approach, we can observe the granularity and direct control provided. By directly manipulating 
the microcontroller’s registers, it offers very high efficiency and performance: 


Level of abstraction: Low 

Ease of use: Challenging for beginners, as it requires in-depth hardware knowledge 
Code verbosity: Less verbose, direct 

Portability: Limited, as the code is often specific to a particular hardware setup 


Performance: Very high, as it allows for direct and optimized hardware manipulation 


Assembly language 


Finally, let's analyze the assembly language implementation for configuring PA5 as an output pin: 


10 


w w 


1O 


LD 
LD 
OR 
ST 


U GPIOA MODER, 0x40020000 
U RCC_AHB1ENR, 0x40023800 


; Enable clock for GPIOA 


R RO, =RCC_AHB1ENR 
R R1, [RO] 
ie Jal, Ri NL a 0) 
R R1, [RO] 
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; Set PA5 as output 
LDR RO, =GPIOA MODER 
LDR R1, [RO] 

ORIN iil, Rul, (Il xe al) 
STR R1, [RO] 


Let’s break it down: 


e EQU GPIOA MODER, 0x40020000 and EQU RCC AHB1ENR, 0x40023800 define 
constants for the memory addresses of the GPIOA mode register (GPIOA_MODER) and the 
RCC AHBI peripheral clock ER (RCC_AHB1ENR). 'EQU' is used in assembly to equate a 
label to a value or address. 


The rest of the assembly instructions perform two main tasks: 


e Enable the clock for GPIOA: 


LDR RO, =RCC_AHB1ENR: Load the address of the RCC_AHB1ENR register into the 
RO register 


LDR R1, [RO]: Load the value of the RCC_AHB1ENR register into the R1 register 


ORR R1, R1, #(1 << 0): Performa bitwise OR operation to set bit 0 of R1, turning 
on the clock for GPIOA 


STR R1, [RO]: Store the updated value back into the RCC_AHB1ENR register 


e Set PA5 as output: 


LDR RO, =GPIOA_MODER: Load the address of the GPIOA_MODER register into RO 
LDR R1, [RO]: Load the current value of the GPIOA MODER register into R1 


ORR R1, R1, #(1 << 10): Usea bitwise OR operation to set bit 10 of R1, configuring 
PAS as an output 


STR R1, [RO]: Store the updated value back into the GPIOA_MODER register 


The assembly language approach allows for extremely detailed and direct control over the microcontroller. 
We often use this in projects where high performance is crucial, and every aspect of the hardware 
needs to be precisely managed. 


Locating and understanding the development board's components 


The assembly language approach offers us the following: 


e Level of abstraction: Lowest 


e Ease of use: Most challenging, requiring a thorough understanding of the 
microcontroller’s architecture 


e Code verbosity: Can be verbose for complex tasks, due to low-level nature 
e Portability: Very limited, as it is highly specific to the microcontroller’s architecture 


e Performance: Highest, as it allows for the most optimized and direct control possible 


The following diagram shows each method and its closeness to the microcontroller’s architecture: 


Hardware Abstraction Layer(HAL) 
Low Layer(LL) 


Bare-Metal C 


Assembly Language 


Microcontroller Architecture 


Figure 2.1: Firmware development methods, arranged in order of proximity 
to the microcontroller’s architecture 


Now that we have explored the diverse approaches to firmware development for STM32 microcontrollers, 
we are ready to delve into the realm of bare-metal C programming. 


We will begin our exploration by understanding how the main components of our development board 
are connected to specific pins of the onboard microcontroller. This initial step is crucial for gaining 
insight into the hardware layout and preparing us for detailed programming tasks. 


Locating and understanding the development 
board’s components 


In this section, our focus is to pinpoint the specific ports and pins on the microcontroller to which 
the user LED, user push button, berg pins, and Arduino-compatible headers are connected on the 
development board. Understanding these connections is crucial for our programming tasks. To 
accurately identify these connections, we will consult the NUCLEO-F411 User Manual. 
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USER 
BUTTON 
USER 
LED 
BERG 
PINS 


STM32F411RE MICROCONTROLLER 


Figure 2.2: Development board showing components of interest 


Now, let’s locate the microcontroller pin connected to the User LED on the development board. 


Locating the LED connection 


Our first step is to navigate through the table of contents to find the section dedicated to LEDs. This 
can be done quickly by locating Figure 2.3 in the manual, which shows the page number for the LEDs 
section and allows us to jump directly to it. 


Click on the page number to jump to the LEDs section. 


6.4 LEDR ch cece cwewaneeencoraseeace Cae eera ha cease EE ene 23 

6.5 PUSI-DUNONG Gita. x cent wince sGebar hasan caer aumins eae bas 23 

GB: ISPOUDD) scion E N E A E E E 23 
268 UM1724 Rev 14 ky 


Figure 2.3: This is the part of the NUCLEO-F411 User Manual’s table of 
contents showing the page number of the LEDs section 


In the LEDs section, we find that the User LED, labeled User LD2, is linked to the ARDUINO” signal 
D13. This corresponds to either pin PA5 or PB13, depending on the specific STM32 target of our board. 


Locating and understanding the development board’s components 


It’s important to note the dual naming convention used in the User Manual due to the board’s 
compatibility with the Arduino IDE. In the Arduino naming scheme, pins are categorized as either 
analog (preceded by “A’) or digital (preceded by “D”). For example, digital pin 3 is denoted as D3. 
Conversely, the standard STM32 convention starts with a “P? followed by a letter indicating the port 
and then the pin number within that port, such as PAS for the 5th pin of port A. 


To determine whether pin D13 of our development board is PA5 or PB13 of the onboard microcontroller, 
we refer to Tables 11 to 23 in the manual. These tables map the ARDUINO” connector pins to standard 
STM32 pins for each development board covered in the document. Specifically, we look at Figure 2.5 
showing Table 16, which pertains to our development board model. 


Navigate to Table 11 by clicking on it as shown in Figure 2.4. This action will take you to the initial 
table in the sequence of tables. Then, scroll down until you get to Table 16. 


User LD2: the green LED is a user LED connected 
to STM32 I/O PAS (pin 21) or PB13 (pin 34) depen: 
Table 11 to Table 23 when: 


° the I/O is HIGH value, the LED is on 
° the I/O is LOW, the LED is off 


Figure 2.4: This is the part of the NUCLEO-F411 User Manual that 
shows the two possible connections of the User LED 


Upon reviewing Table 16, we find that D13 indeed corresponds to PA5. This indicates that the User LED 
on our NUCLEO-F411RE development board is connected to pin PA5 of the onboard microcontroller. 


Table 16. ARDUINO® connectors on NUCLEO-F401RE and NUCLEO-F411RE (continued) 


Connector Pin Pin name STM32 pin Function 


6 D13 PAS SPI1_SCK 


Figure 2.5: Table 16 shows that D13 corresponds to PA5 


Another useful component found on the development board is the User Push button. Let’s find the 
pin connection of this component. 


Locating the User Push button 


The User Push button on the development board is an important component for input handling in 
many embedded experiments, and understanding its connection to the microcontroller is crucial for 
effective programming and interaction. 


To locate the connection details of the User Push button on our board, we will navigate through the 
table of contents to find the section dedicated to Push-buttons. 
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This can be done quickly by locating Figure 2.6 in the manual, which shows the page number for the 
Push-buttons section and allows us to jump directly to it by clicking on the page number. 


6.5 E neni inn iba newest ames T T OE ETE, 23 
66. JPOUDD) ccccciccedcteesatersdes nde nedsececces dds secessed 23 
2168 UM1724 Rev 14 'S7 


Figure 2.6: This part of the NUCLEO-F411 User Manual provides 
the page number for the Push-buttons section 


Upon navigating to the Push-buttons section of the manual, we locate the relevant information about 
our User Push-button, identified as B1 User. The manual tells us that this button is connected to pin 
PC13 of the onboard microcontroller. 


Locating the berg pins and Arduino-compatible headers 


In the LEDs sections, we learned that our NUCLEO-F411 development board features two primary 
naming systems for its pins: the Arduino naming system and the standard STM32 naming system. 
While bare-metal programming primarily utilizes the standard STM32 naming, the pins on the board 
itself are labeled according to the Arduino system. It is important to know the actual port names and 
pin numbers of these exposed pins so that we can properly connect and program external components 
such as sensors and actuators. 


Figure 2.7 shows our NUCLEO-F411 development board with the columns of berg pins highlighted. 


BERG 


PINS PERG 


PINS 


ERF 


5 ht 
www.st.com/stm32nucleo 


Figure 2.7: This is the NUCLEO-F411 development board with berg pins highlighted 


Locating and understanding the development board's components 


The Arduino header pins of the development board are located at the sides of the berg pins columns. 
This is highlighted in Figure 2.8. 


gui? 10E Ab 


ARDUINO 
ARDUINO HEADERS 


HEADERS 


HF 
www.stcom/stm32nucleo 


Figure 2.8: This NUCLEO-F411 development board with Arduino headers highlighted 


Let’s start by finding the microcontroller pin connections of the Arduino header pins. 
Arduino-compatible headers 


To identify the standard STM32 names for the pins on the Arduino header, we navigate to the section 
titled ARDUINO? connectors using the table of contents. This directs us to Table 11, which provides 
the mappings of Arduino pins to STM32 pins. For our specific NUCLEO-F411 development board, 
we focus on Table 16, which offers the relevant mapping for our model. 


Next, let’s locate the connections of the berg pins. 
The berg pins 


Similar to our approach with other components, to find details about the berg pins, we again consult 
the table of contents in the manual and locate the section called Extension connectors. This section 
includes figures illustrating the pinouts for various NUCLEO boards. We then scroll to find the 
pinout corresponding to our specific NUCLEO model. Here, the pinout of our development board is 
presented in the standard STM32 naming system. 
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Over here, we also discover that the manual refers to the columns of male header berg pins as the ST 
morpho connector. This means that whenever the morpho connector term is used, it is referring to 
these male header berg pins. 


NUCLEO-F411RE 


CN7 CN6 pco CNS CN10 
PC10 PC11 a PC8 
PC12 PD2 PC6 
VDD E5V Pie NEE PC5 
AVDD AVDD 
BOOTO GND U5V 
NC NC NC Ne) (sb) NC 
D13 PAS 
NC IOREF |OREF a cae PA12 
PA13 RESET RESET baa nae PA11 
PA14 +3V3 +3V3 Baa tee PB12 
PA15 +5V  +5V no Bee NC 
GND GND GND a Bae GND 
PB7 GND GND PB2 
PC13 VIN VIN D7 PAB PB1 
PC14 NC D6 PB10 PB15 
PC15 PAO A0 D5 PB4 PB14 
PHO PAI At D4 PBS PB13 
PH1 PA4 A2 D3 PB3 AGND 
VBAT PBO A3 D2 PA10 PC4 
PC2 PC1 A4 D1 PA2 NC 
PC3 PCO A5 DO PA3 NC 
E Arduino HB Morpho 


Figure 2.9: Pinout of the NUCLEO-F411 development board 


In this section, we learned that NUCLEO development boards use two naming systems. Firstly, the 
Arduino naming system, which is visibly marked on the board, and secondly, the standard STM32 
naming system, detailed in the documentation. We discovered that the standard STM32 naming 
system is particularly relevant for our purposes, as it directly correlates to the pin names of the 
onboard microcontroller. 


The next section will guide us on how to access and manipulate the relevant memory locations of the 
onboard microcontroller. Our focus will be on configuring pin PA5 as an output pin. This will allow 
us to control the LED connected to PA5. 


Defining and creating registers through 
documentation insights 


In the previous section, we established that the User LED is connected to pin PA5. This means that it 
is linked to pin number 5 on GPIO PORTA. In other words, to get to the LED, we have to go through 
PORTA and then locate pin number 5 of that port. 


Defining and creating registers through documentation insights 


As illustrated in Figure 2.10, the microcontroller has exposed pins on all four sides. These pins are 
organized into distinct groups known as ports. For instance, pins in PORTA are denoted with the PA 
prefix, while those in PORTB start with PB, and so forth. This systematic arrangement allows us to 
easily identify and access specific pins for programming and hardware interfacing tasks. 


VBAT O1 VDD 
Pc13 2 vss 
PC14-OSC32_IN O03 PA13 
PC15-OSC32_OUT O4 PA12 
PHO-OSC_IN O 5 PA11 
PH1-OSC_OUT O] 6 PA10 
NRST O7 PA9 
Pco g8 PA8 
pet LQFP64 ap 
PC2 PC8 
PC3 PC7 
VSSA/VREF- PC6 
VDDA/VREF+ PB15 
PAO PB14 
PAI PB13 
PA2 PB12 


Figure 2.10: STM32F411 pinout 


In the next section, we will go through the steps to locate the precise address of GPIO PORTA. 


Locating GPIO PORTA 


To effectively interact with any part of our microcontroller, it’s essential to know the memory address 
of that specific part. Our next step is to explore the memory map of the microcontroller. By doing so, 
we can locate the address of GPIO PORTA in a step-by-step manner. 


Since our focus is now on the onboard microcontroller rather than the development board, we need 
to refer to the microcontroller’s datasheet, specifically the stm32£411re.pd£ document. 


Let’s start by navigating to the table of contents of the document. There, we'll find a section entitled 
Memory Mapping. Click on the corresponding page number to jump to that section. 


Over here, we find a comprehensive diagram that illustrates the entire memory map of the microcontroller. 
A relevant excerpt of this diagram is presented in Figure 2.11, which shows the overall memory layout. 
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OxFFFF FFFF 


internal 
0xE000 0000 ripherals 
OxDFFF FFFF 


Not used 
0xC000 0000 
OxBFFF FFFF 


Reserved 


0x6000 0000 
OxSFFF FFFF 


0x4000 0000 
Ox3FFF FFFF 


0x2000 0000 
Oxi FFF FFFF 


0x0000 0000 


Figure 2.11: Memory map 


The memory map shows that everything inside the microcontroller is addressed from 0x0000 0000 
to OXFFFF FFFF. Were interested in the part about peripherals because that’s where we find GPIOA. 


In the context of microcontrollers, a peripheral refers to a hardware component that is not part of 
the central processing unit (CPU) but is connected to the microcontroller to extend its capabilities. 
Peripherals perform specific functions and can include a wide range of components, such as the following: 


e GPIO ports: These are used for interfacing with external devices such as LEDs, switches, and 
sensors. They can be programmed to either receive input signals or send output signals. 


e Communication interfaces: These include serial communication interfaces such as a universal 
asynchronous receiver-transmitter (UART), Serial Peripheral Interface (SPI), and Inter- 
Integrated Circuit (I2C), which enable the microcontroller to communicate with other devices, 
sensors, or even other microcontrollers. 


e Timers and counters: Timers are used for measuring time intervals or generating time-based 
events, while counters can be used to count events or pulses. 


e Analog-to-digital converters (ADCs): ADCs convert analog signals (such as those from a 
temperature sensor) into digital values that the microcontroller can process. 


The peripherals mentioned here represent a selection of the common types found in microcontrollers. 
As we progress through subsequent chapters, we will delve deeper into these and other peripherals. 


Defining and creating registers through documentation insights 


In Figure 12.11, the memory map shows that the address range for all the microcontroller’s peripherals 
spans from 0x40000000 to OxSFFFFFFF. This means that GPIO PORTA’ address lies within this 
specified range. 


Peripherals base address = 0x40000000 


Let’s note down the start of the peripheral address, which we will refer to as PERIPH_BASE, 
indicating the base address for the peripherals. We will need this for calculating the address of 
GPIO PORTA. PERIPH_ BASE = 0x40000000. 


Figure 12.12 shows a zoomed-in view of the peripherals section in the memory map. Here, we observe 
that the peripheral memory is segmented into five distinct blocks: APB1, APB2, AHB1, AHB2, and 
the Cortex-M internal peripherals block, which is located at the top. 

OxE010 0000 - OxFFFF FFFF 


Cortex:M4 internal! 0xE000 0000 - OxEOOF FFFF 
OxDFFF FFFF 


Reserved 


0x5004 0000 
0x5003 FFFF 


AHB2 
0x5000 0000 


4002 6800 - Ox4FFF FFFF 
0x4002 67FF 


0x4002 0000 
4001 4C00 - 0x4001 FFFF 
0x4001 4BFF 


0x4001 0000 


0x4000 7400 - 0x4000 FFFF 
y 0x4000 73FF 


0x4000 0000 


Figure 2.12: Peripherals memory map 
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These blocks, except for the Cortex-M internal peripherals, are named after the bus systems they 
interface with - namely, the Advanced Peripheral Bus (APB) and the Advanced High-Performance 
Bus (AHB): 


e APB1 and APB2: These buses cater to lower bandwidth peripherals, providing a more efficient 
means of communication for devices that do not require high-speed data transfer. 


e AHB1 and AHB2: These are designed for high-speed data transfer and are used to connect 
high-bandwidth peripherals. They enable faster and more efficient data, control, and 
address communication. 


On pages 54 to 56 of the datasheet, we find a table delineating the boundary addresses for each bus 
and the associated peripherals. A segment of this table is shown in Figure 2.13, where we find that 
GPIOA is allocated a boundary address from 0x40020000 to 0x4002 03FF and is connected to 
the AHB1 bus. Therefore, this indicates that the addresses for all registers related to GPIO PORTA 
are encompassed within this address range. 


Bus Boundary address Peripheral 
AHB2 0x5000 0000 - 0x5003 FFFF USB OTG FS 
0x4002 6800 - Ox4FFF FFFF Reserved 
0x4002 6400 - 0x4002 67FF DMA2 
0x4002 6000 - 0x4002 63FF DMA1 
0x4002 5000 - 0x4002 4FFF 
0x4002 3C00 - 0x4002 3FFF Flash interface register 
0x4002 3800 - 0x4002 3BFF RCC 
0x4002 3400 - 0x4002 37FF Reserved 
0x4002 3000 - 0x4002 33FF CRC 
AHB1 


0x4002 2000 - 0x4002 2FFF 
0x4002 1C00 - 0x4002 1FFF 


Reserved 
GPIOH 


0x4002 1400 - 0x4002 1BFF 
0x4002 1000 -0x4002 13FF |GPIOE 
0x4002 0C00 - 0x4002 0FFF |GPIOD 
0x4002 0800 -0x4002 0BFF | GPIOC 
0x4002 0400 - 0x400207FF |GPIOB 


0x4002 0000 - 0x4002 O3FF GPIOA 


Figure 2.13: Boundary address and Bus of GPIOA 


Defining and creating registers through documentation insights 


GPIOA base address = PERIPH_BASE + 0x20000 = 0x40020000 


From the table, we find that the starting address for the GPIOA boundary is 0x40020000. 
This reveals that adding an offset of 0x20000 to the PERIPH_BASE address (which is 
0x40000000) results in the base address of GPIOA, calculated as 0x40000000 + 0x20000 
= 0x40020000. The term “offset value” refers to the value added to derive a specific address 
from a base address. In this case, the offset value for GPIOA from the PERIPH_BASE address 
is OX20000. 

Na J 


Understanding the concept of offset values is crucial for accurately calculating desired addresses in 
microcontroller programming. This understanding enables precise navigation and manipulation 
within the system’s memory map. 


Clock gating 


Having identified the exact address of GPIOA, our next step is to enable clock access to it before 
configuring its registers. This step is necessary because, by default, the clock to all unused peripherals 
is disabled to conserve power. 


Modern microcontrollers use a power-saving technique known as clock gating. In simple terms, clock 
gating involves selectively turning off the clock signal to certain parts of the microcontroller when 
they’re not in use. The clock signal is an essential part of microcontroller operations, as it drives the 
sequential logic by providing a regular pulse that synchronizes the activities of the microcontroller’s 
circuits. However, when a particular part of the microcontroller, such as a peripheral, is not actively being 
used, the clock signal to that part is disabled. This disabling prevents unnecessary power consumption 
by idle circuits. Therefore, before using any peripheral, it’s required to first enable clock access to it. 


As shown in Figure 2.14, there's a peripheral listed as RCC, which has a boundary address range from 
0x40023800 to 0x40023BFF. The functions of this peripheral include enabling and disabling 
clock access to other peripherals. 
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RCC base Address = PERIPH_BASE + 0x23800 = 0x40023800 


From the boundary address information, we can see that the RCC _ Base address is obtained 
by adding an offset of 0x23800 to PERIPH_ BASE. 


Bus Boundary address Peripheral 


AHB2 0x5000 0000 - 0x5003 FFFF USB OTG FS 
0x4002 6800 - Ox4FFF FFFF Reserved 
0x4002 6400 - 0x4002 67FF DMA2 
0x4002 6000 - 0x4002 63FF DMA‘1 
0x4002 5000 - 0x4002 4FFF Reserved 
0x4002 3C00 - 0x4002 3FFF Flash interface register 
0x4002 3800 - 0x4002 3BFF RCC 

0x4002 3400 - 0x4002 37FF Reserved 
0x4002 3000 - 0x4002 33FF CRC 

0x4002 2000 - 0x4002 2FFF Reserved 
0x4002 1C00 - 0x4002 1FFF GPIOH 
0x4002 1400 - 0x4002 1BFF Reserved 
0x4002 1000 - 0x4002 13FF GPIOE 
0x4002 0C00 - 0x4002 OF FF GPIOD 
0x4002 0800 - 0x4002 OBFF GPIOC 
0x4002 0400 - 0x4002 07FF GPIOB 
0x4002 0000 - 0x4002 03FF GPIOA 


AHB1 


Figure 2.14: Boundary address and Bus of RCC 


Having successfully determined the base addresses for the two essential peripherals needed to configure 
GPIOA pin 5 (which controls the connected LED), our next step involves using these base addresses 
to derive the specific register addresses necessary for setting the pin as an output and ultimately 
activating the LED. 


To locate the detailed information about these registers, we will refer to the reference manual (RM0383). 
This document provides comprehensive insights into all registers and their configurations. 


The AHB1 ER 


Our reference manual, RM0383, is a comprehensive document spanning over 800 pages, and some 
STM32 reference manuals even exceed 1,500 pages. The objective is not to read the entire manual 
cover to cover, but rather to develop the skill to efficiently locate specific information as needed. 
Previously, we established that GPIOA is connected to the AHB1 bus. We also learned that activating 
this peripheral requires enabling clock access through the RCC peripheral. 


Defining and creating registers through documentation insights 


The RCC peripheral in our microcontroller includes a specific register dedicated to enabling the 
clock for each bus. In STM32 microcontrollers, the naming of registers follows a straightforward 
pattern: peripheral acronym + underscore + register acronym. For example, the register responsible 
for controlling clock access to the AHB1 bus is named RCC_AHB1ENR. 


Let’s explain this a bit more: 


e RCC stands for Reset and Clock Control 
e AHB1 stands for Advanced High-Performance Bus 1 
e ENR stands for Enable Register 


This systematic naming convention simplifies the process of identifying and accessing the 
appropriate registers. 


To find the information about the RCC_AHB1ENR register, we begin by opening the reference manual. 
Next, we navigate to the table of contents and search for the section titled RCC AHB1 Peripheral Clock 
Enable Register (RCC_AHB1ENR). Once located, we click on the page number provided alongside 
this section title to directly jump to the relevant part of the document. 


Let’s begin by examining the details presented at the top of the page, as illustrated in Figure 2.15. This 
section provides key information about the register, including the following: 


e Register name: The full name of the register is provided along with its abbreviation, namely 
RCC AHB1 Peripheral Clock Enable Register (RCC_AHB1ENR). 


e Address offset: The offset from the base address is indicated as 0x30. 


e Reset value: This is specified as 000000000, indicating the value the register holds upon 
reset. In other words, the default value of the register. 


e Access type: The register supports various access types — it can be accessed without wait states 
and allows word, half-word, and byte access. 


e Register diagram: A detailed diagram of the register is included, showing all 32 bits along 
with labels for each bit. 
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6.3.9 RCC AHB1 peripheral clock enable register (RCC_AHB1ENR) 
Address offset: 0x30 
Reset value: 0x0000 0000 


Access: no wait state, word, half-word and byte access. 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 
DMA2EN 
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15 14 13 12 11 10 9 8 7 6 5 4 3 1 0 


2 
GPIOH GPIOD | GPIOC |GPIOB} GPIOA: 
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Figure 2.15: RCC AHB1 ER 


RCC_AHB1ENR Address = RCC_BASE + 0x30 = 0x40023830 


From this information, we can accurately calculate the address of the RCC_AHB1ENR register. 
We do this by adding the RCC_AHB1ENR offset to the RCC_BASE address. The formula 
is as follows: RCC_BASE + RCC_AHB1ENR Offset = 0x40023800 + 0x30 = 
0x40023830. 


The same section of the reference manual also includes a detailed description of each bit within the 
register. We are particularly interested in the bit named 'GPIOAEN', which stands for GPIOA Enable. 
This is identified as bit number 0. Further down, at the start of the next page in the document, we 
find a precise description of bit 0, as depicted in Figure 2.16. This description explains that setting 
bito to 0 disables the GPIOA clock while setting it to 1 enables the GPIOA clock. 


Bit 0 GPIOAEN: IO port A clock enable 


Set and cleared by software. 
0: 1O port A clock disabled 
1: IO port A clock enabled 


Figure 2.16: GPIOAEN bit 


Having understood the steps required to enable clock access to GPIOA, our next section will focus 
on learning how to set and clear bits within a register. 


Defining and creating registers through documentation insights 


Setting and clearing bits in registers 


In bare-metal programming, manipulating individual bits within registers is a fundamental operation. 
This manipulation is important for configuring hardware settings, controlling peripheral devices, and 
optimizing the performance of embedded systems. Let’s start by understanding bits and registers. 


A bit is the basic unit of information in computing and digital communications, which can have a value 
of either 0 or 1. Registers, on the other hand, are small-sized storage locations within microcontrollers, 
used to store data temporarily for various operations. Registers are typically a collection of bits (such 
as 8-bit, 16-bit, or 32-bit registers), and each bit in a register can be manipulated individually. In bare- 
metal programming, the two most frequently used bit operations are setting a bit and clearing a bit. 


Setting a bit 


Setting a bit means changing its value to 1. We will often use this to activate or enable a specific 
function within a microcontroller. 


How it’s done: The bitwise OR operation (| ) is used for setting a bit: 


register |= 1 << bit position; 


In this operation, 1 is shifted left (<<) to bit position and then ORed with the current value 
of the register. The left shift operation creates a binary value where only the target bit is 1, and all 
others are 0. The OR operation then sets the target bit in the register to 1, leaving the rest unchanged. 


Let’s assume register initially holds the 0011 (binary) value and we want to set the third bit (bit 
position 2, 0-indexed). The bit-shifted value would be 0100 (binary for 1 << 2). The OR operation 
is then as follows: 


0011: original register value 
(0001 <<2) = 0100: bit-shifted value 
omai 


0111 (resulting value) 


In this example, the first, second, and fourth bits of the original register value retain their value while 
the value of the third bit is changed to 1. 


Let’s see the opposite of setting a bit. 
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Clearing a bit 


Conversely, clearing a bit means changing its value to 0, typically to deactivate a function. 
How it’s done: The combination of bitwise AND (&) and NOT (~) operations is used for clearing a bit: 


register &= ~(1 << bit position) ; 


Here, 1 is left-shifted to bit position, and then a bitwise NOT operation is applied to create a 
binary number where all bits are 1, except the target bit. The AND operation with the register clears 
the target bit, leaving others as they are. 


Let’s assume register initially holds the 0111 (binary) value and we want to clear the third bit 
(bit position 2, 0-indexed). The bit-shifted value for the mask would be 0100 (binary for1 << 2). 
To clear the bit, we use the bitwise AND operation with the bitwise NOT of the bit-shifted value. The 
operation is as follows: 


0111: original register value 

(0001 << 2) = 0100: bit-shifted value 

~0100 = 1011: bitwise NOT of bit-shifted value 
O 

AND 

OILI 


0011 (resulting value) 


In this operation, the third bit of the register is cleared (changed to 0), while the other bits retain 
their original values. 


Let’s summarize the key points from the last two sections. Firstly, we understood that enabling clock 
access to GPIOA requires setting bit 0 in the RCC_AHB1ENR register. Secondly, we explored how 
to set and clear bits in registers using bitwise operations. Moving forward, in the next section, we 
will focus on configuring GPIOA pin 5 (PA5) as an output pin. This step is crucial in our progress 
toward activating the LED connected to PAS. This will take us a step closer to activating the LED 
connected to PAS. 


The GPIO port mode register (GPIOx_MODER) 


The GPIO port mode register in STM32 microcontrollers is a specialized register used for setting the 
mode of each GPIO pin. To locate information about this register, we navigate to the table of contents 
of the reference manual and look for the section titled GPIO Port Mode Register (GPIOx_MODER). 
By clicking on the associated page number, we are directly taken to the section. 


Defining and creating registers through documentation insights 


Figure 2.17 shows the top section of the page. Here, we can observe the following details: 


e Port applicability (x = A...E and H): This notation indicates that the register information is 
relevant to multiple ports. Specifically, it applies to GPIOA_MODER through GPIOE_MODER, 
as well as GPIOH_MODER. This means the same register structure and configuration are 
consistent across these GPIO ports. 


e Address offset - 0x00: This tells us that the register’s address is the same as the base address 
of the corresponding GPIO port. In other words, for each GPIO port, the MODER register is 
located at the very beginning of the ports memory space. Therefore, the GPIOA MODE register 
address is the same as the GPIOA base address. 


e Reset values: 


* Port A: The default value for the GPIOA_MODER register is OXA800 0000. This value 
represents the initial configuration state of the GPIOA pins upon reset. 


" Port B: The GPIOB_MODER register has a default reset value of Ox0000 0280. 


* Other ports: The reset value for the MODER registers of all other specified GPIO ports is 
O0x0000 0000: 


GPIOA MODE Register Address = GPIOA BASE = 0x40020000 


8.4.1 GPIO port mode register (GPIOx_MODER) (x = A..E and H) 


Address offset: 0x00 

Reset values: 

e —0xA800 0000 for port A 

e 0x0000 0280 for port B 

e — 0x0000 0000 for other ports 
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Figure 2.17: GPIO port mode register 


We also observe that this is a 32-bit register, with its bits organized in pairs. For example, bit0 and 
bit1 together form a pair known as MODERO, bit2 and bit3 form MODER1, bit4 and bit5 
form MODER2, and so on. Each of these pairs corresponds to a single pin of the GPIO port. Specifically, 
MODERO controls the configuration of PINO of the corresponding port, MODER1 controls PIN1, and 
this pattern continues similarly for the other pins. 
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Given our objective of configuring the mode of PINS, we need to focus on MODERS. In the register, 
MODERS5 comprises bit 10 and bit11. These two bits are the required bits for setting the operational 
mode of PINS. 


The reference manual provides a truth table, illustrated in Figure 2.18, which explains the combinations 
of the two MODER bits necessary to configure a pin. This table is an invaluable resource for understanding 
how to set the bits for the desired pin configuration, whether it’s as an input, output, or an alternate 
function mode. 


Bits 2y:2y+1 MODERy[1:0]: Port x configuration bits (y = 0..15) 
These bits are written by software to configure the I/O direction mode. 
00: Input (reset state) 


01: General purpose output mode 
10: Alternate function mode 


11: Analog mode 


Figure 2.18: The MODER bits configuration 


As shown in Figure 2.18, the MODER register within the GPIO port is composed of pairs of bits, 
designated as 2y: 2y+1 MODERy [1:0], where y ranges from 0 to 15, representing each of the 16 
pins in the port (PINO to PIN15). 


In this equation, y represents the pin number. For Pin 5, y = 5. Plugging this value into the equation 
gives us the bit positions in the MODER register that correspond to Pin 5: 


The bit positions for MODERS are calculated as 2*y and (2*y) +1. 
Substituting y = 5, we get the following: 
2*5 = 10 and 2*5 + 1 = 11, which are 10 and 11, respectively. 


So, bits 10 and 11 in the MODER register (MODER5 [1 : 0] ) are the bits that control the mode of 
Pin 5. By setting these bits to specific values (00, 01, 10, or 11), we can configure Pin 5 as an input, 
general-purpose output, alternate function, or analog mode, respectively. 


The GPIO MODER register supports four distinct bit combinations, each defining a different operational 
mode for the corresponding pin: 


e 00: When both bits are 0, the corresponding pin is set as an input pin. This is the standard 
mode for pins to receive data from external sources. 


e 01: Setting the bits to 01 sets the pin function to general-purpose output. In this mode, the 
pin can send data out, for instance, to light up an LED. 


e 10: The 10 state configures the pin for alternate functions. Each pin can serve specific 
additional purposes (such as PWM output and I2C communication lines), and this mode 
enables those functions. 
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e 11: When the bits are set to 11, the pin operates in analog mode. This mode is typically used 
for ADC, useful in reading values from analog sensors. 


From this, we understand that to configure PA5 as an output, we must set bit 10 of the GPIOA_MODER 
register to 0 and bit 11 to 1. 


Let’s summarize our progress toward activating the LED connected to PAS: 


e Enabling clock access to GPIOA: We've learned how to enable the clock for GPIOA, by setting 
bito of the RCC_AHB1ENR register to 1. This step is essential to power the GPIOA for operation. 


¢ Configuring PA5 for output: We have just learned how to set PA5 as a general-purpose output pin. 


These two steps effectively configure PA5 as an output pin. The final task involves controlling the 
output state of the pin — setting it to either 1 or 0, which corresponds to on or off, respectively. This 
translates to sending either 3.3v or Ov to PAS, thus turning the connected LED on or off. To manage 
the output state of a pin, we need to interact with the Output Data Register (ODR). Locating and 
configuring this register will be the focus of the next section. 


So far, we have the following information for our quest to activate the LED connected to PAS: 


e We know how to enable clock access to GPIOA through the RCC_AHB1ENR register 


e We know how to configure the PINS of GPIOA as a general-purpose output pin. 


These two steps make PA5 act as an output pin. The final step is to be able to set the output state of 
the pin. The state can be either 1 or 0 corresponding to on or off, corresponding to sending 3.3v or 
Ov to PAS, and finally corresponding to turning on or turning off the LED connected to PAS. To set 
the output of a pin, we need to access the ODR; this shall be the focus of the next section. 


GPIO Port Output Data Register (GPIOx_ODR) 


The GPIO Port ODR in STM32 microcontrollers is used for controlling the output state of each GPIO 
pin. To find information about this register, we refer to the table of contents in the microcontroller’s 
reference manual and locate the section titled GPIO Port Output Data Register (GPIOx_ODR). Clicking 
on the page number corresponding to this section takes us straight to the required information. 


Figure 2.19 displays the top part of the page, where we can observe the following details: 


e Port applicability (x = A...E and H): This indicates that the register information is relevant 
to multiple ports. Specifically, it applies to GPIOA_ODR through GPIOE_ODR, as well as 
GPIOH_ODR. This means the same register structure and configuration are consistent across 
these GPIO ports. 


e Address offset - 0x14: This tells us that the address of the ODR register is offset by 0x14 from 
the base address of its respective GPIO port. This means that, for each GPIO port, the ODR 
register can be found at this offset from the port’s base memory address. 
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e Reset values - 0x00000000: This indicates that, by default, all bits in the ODR are set to 0 
upon a reset. This default state ensures that all GPIO pins are initially in a low output state. 


GPIOA ODR address = GPIOA_BASE + ODR_OFFSET = 0x40020014 
ODR_OFFSET = 0x14. 


8.4.6 GPIO port output data register (GPIOx_ODR) (x = A..E and H) 
Address offset: 0x14 
Reset value: 0x0000 0000 
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Bits 31:16 Reserved, must be kept at reset value. 


Bits 15:0 ODRy: Port output data (y = 0..15) 
Figure 2.19: GPIO port ODR 


Let’s delve into the structure of the bits: 


e Bits 31:16 (reserved): These bits are reserved and should not be used for any operation. 


e Bits 15:0 (ODRy: port output data, y = 0..15): These bits, ranging from 0 to 15, correspond 
to each of the 16 pins in the GPIO port. They are directly programmable and can be both 
read and written by software. Changing the value of these bits alters the output state of the 
corresponding GPIO pin. 


Let’s consider our final task of activating the LED connected to PAS: 


e GPIOA_ODR bit for PA5: Given that PA5 corresponds to the 5th pin in the GPIOA port, we 
target the 5th bit (bit 5) in the GPIOA_ODR register. 


e Setting PA5 high: To set PA5 high, we need to write 1 to bit 5 of the GPIOA_ODR register. 
This can be achieved using a bitwise OR operation, as follows: 


GPIOA_ODR |= 1 << 5; 
In this operation, we shift 1 left by 5 positions (resulting in a binary value where only the 6th bit is 1) 


and then OR it with the current value of the GPIOA_ODR register. This action sets PA5 high without 
altering the state of other pins in the port. 


Register manipulation — from configuration to running your first firmware 


Setting PA5 high will supply voltage to the connected LED, effectively turning it on. 


Now that we understand how to set the state of GPIO output pins, in the next section, we will combine 
all the pieces of information weve acquired and develop our first firmware. 


Register manipulation — from configuration to running 
your first firmware 


In this section, we will apply the knowledge acquired throughout this chapter to develop our first 
bare-metal firmware. 


We begin by creating a new project, a process we covered in Chapter 1. Here is a summary of the steps: 


1. Start a new project: Go to File | New | STM32 Project in your IDE. 


2. Select the target microcontroller: A Target Selection window will appear, prompting you to 
choose the microcontroller or development board for your project. 


3. Use the board selector: Click on the Board Selector tab. 
4. Search for our board: Input NUCLEO- F411 in the Commercial Part Number field. 


5. Select our board: From the list of boards that appear, choose NUCLEO-F411RE and then 
click Next. 


6. Name our project: Assign a name to your project, for instance, RegisterManipulation. 
7. Project configuration: In the project options, select an Empty project setup. 
8. Final step: Click Finish to create your project. 


Once the project is created, open the main . file in your project workspace. Clear all pre-existing 
text in this file to start with a clean slate for our code. 


For a clearer understanding, we will structure our code into two distinct sections. The first section will 
be titled Register Definitions, and the second will be named Main Function. 


Register Definitions 


This section of the code defines constants and macros for memory addresses and bit masks. Here are 
all the memory addresses and bit masks required for controlling the LED connected to PAS: 


// 1: Define base address for peripherals 
#define PERIPH BASE (0x40000000UL) 
// 2: Offset for AHB1 peripheral bus 
#define AHB1PERIPH OFFSET (0x00020000UL) 
// 3: Base address for AHB1 peripherals 

#define AHB1PERIPH BASE (PERIPH BASE + AHB1PERIPH OFFSET) 
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// 4: Offset for GPIOA 


#define GPIOA OFFSET O0x0000UL) 

// 5: Base address for GPIOA 

#define GPIOA BASE AHB1PERIPH BASE + GPIOA OFFSET) 
// 6: Offset for RCC 

#define RCC OFFSET 0x3800UL) 

// 7: Base address for RCC 

#define RCC BASE AHB1PERIPH BASE + RCC OFFSET) 
// 8: Offset for AHB1EN register 

#define AHB1IEN R OFFSET 0x3 0UL) 


// 9: Address of AHB1EN register 


#define RCC AHB1IEN R (* (volatile unsigned int *) (RCC_BASE 
+  AHBIEN R OFFSET) ) 


// 10: Offset for mode register 
#define MODE R OFFSET (0x00UL) 
// 11: Address of GPIOA mode register 


#define GPIOA MODE R (*(volatile unsigned int *) (GPIOA BASE + MODE R_ 
OFFSET) ) 


// 12: Offset for output data register 


#define OD R_ OFFSET (0x14UL) 

// 13: Address of GPIOA output data register 

#define GPIOA OD R (* (volatile unsigned int *) (GPIOA BASE + OD R_ 
OFFSET) ) 

// 14: Bit mask for enabling GPIOA (bit 0) 

#define GPIOAEN (1U<<0) 

// 15: Bit mask for GPIOA pin 5 

#define PINS (1U<<5) 

// 16: Alias for PIN5 representing LED pin 

#define LED PIN PIN5 


In the previous sections, we gathered base addresses, offsets, and bit masks from the documentation. 
These remain unchanged in our code. However, a notable aspect of the code snippet is the use of the 
UL suffix at the end of each hexadecimal value as well as the use of keywords such as volatile. 
Let’s delve into the significance of these in the context of C and C++ programming. 


The UL suffix 


When we see a number in the code ending with UL, it’s more than just a part of the number - it’s a 
clear instruction to the compiler about the type and size of the number: 


e Unsigned: The U in UL indicates that the number is unsigned. In other words, it’s a positive 
number with no sign to indicate it could be negative. This designation allows the number to 
represent a wider range of positive values compared to a signed integer of the same size. 


Register manipulation — from configuration to running your first firmware 


e Long integer: The L signifies that the number is a long integer. This is important because the 
size of a long integer can vary based on the system and the compiler. Typically, a long integer 
is larger than a regular int - often 32 bits, but sometimes 64 bits on certain systems. 


The UL suffix collectively ensures that these values are treated as unsigned long integers. This is 
important for firmware development and other forms of low-level programming, where the exact size 
and “signedness” of an integer can significantly impact program behavior and memory management. 
Using UL leads to more predictable, platform-independent code, ensuring that the values behave 
consistently across different compilers and systems. 


The use of “(*(volatile unsigned int *)” 


In the context of bare-metal programming, particularly in our current code snippet, we notice that 
the address of each register is prefixed with "(* (volatile unsigned int *)". 


Let’s break down what each part of this notation means and why it’s used: 


e Type casting to a pointer: The (unsigned int *) expression is a type cast. It tells the 
compiler to treat the subsequent address as a pointer to an unsigned integer. In C and C++, 
pointers are variables that store the memory address of another variable. In the context of our 
firmware, this casting means that we are directing the compiler to treat a certain address as 
the location of an unsigned integer. However, this unsigned integer is not just any number; it 
corresponds directly to the state of a 32-bit hardware register. Each bit in this 32-bit integer 
mirrors a specific bit in the register, thereby allowing direct control and monitoring of the 
hardware’ state through standard programming constructs. 


e Dereferencing the pointer: The leading asterisk (*) in * (unsigned int *) is used to 
dereference the pointer. Dereferencing a pointer means accessing the value stored at the memory 
address the pointer is pointing to. Essentially, it’s not just about knowing where the data is (the 
address); it’s about actually accessing and using that data. 


e volatile: The volatile keyword tells the compiler that the value at the pointer can change 
at any time, without any action being taken by the code. This is often the case with hardware 
registers, where the value can change due to hardware events, external inputs, or other aspects 
of the system outside the program's control. 


Without volatile, the compiler might optimize out certain reads and writes to these addresses 
under the assumption that the values don’t change unexpectedly. This is because when a compiler 
processes code, it looks for ways to make the program run more efficiently. This includes removing 
redundant operations or simplifying code paths. This process occurs during the compilation of the 
program before it’s run. If the compiler determines that a variable (including a memory-mapped 
hardware register) doesn't change its value, it might optimize the code by eliminating repeated reads 
from or writes to that variable. For example, if a value is read from a register and then read again 
later, the compiler might assume the value hasn’t changed and use the previously read value instead 
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of accessing the register a second time. The use of volatile is a directive to tell the compiler not to 
apply certain optimizations to accesses of the marked variable, maintaining the integrity of operations. 


In our code snippet, we encounter two additional terms that are very common in bare-metal 
programming: bit mask and alias. Let’s take a closer look at each of these concepts to gain a clearer 
understanding of the roles they play: 


Bit masks: A bit mask is a binary number used as a template to manipulate specific bits within 
another binary number, usually a register value, while leaving other bits unaffected. We use 
bit masks to set, clear, or toggle individual bits in a register. They work by applying bitwise 
operations (such as AND, OR, XOR) along with the mask to achieve the desired result. For 
example, a bit mask might be used to turn on a specific LED connected to a microcontroller 
pin without altering the state of other pins. A bit mask is created by setting the bits we want to 
manipulate to 1 while keeping others at 0. This mask is then combined with the register value 
using the appropriate bitwise operation. 


For example, a 0600000100 mask used with OR will set the third bit of the target register. 


Alias: An alias is simply a name given to a bit or a group of bits to make the code more readable 
and easier to manage. We use aliases to refer to specific hardware features or configurations 
represented by bits in a register. By using a descriptive name, we can make the code more 
understandable and maintainable. For example, naming a bit mask that controls an LED as 
LED PIN makes the code self-explanatory. We define aliases using preprocessor directives 
such as #define in C or C++. For example, #define LED PIN (1U<<5) creates an 
LED PINalias for the bit mask that represents the sixth pin (zero indexing) in a GPIO port ODR. 


Now, let’s analyze the second section of the code. 


Main Function 


This section of our code outlines the primary operations for controlling the LED connected to PA5 
of the microcontroller: 


// Line 17: Start of main function 


int main(void) 


{ 


// 18: Enable clock access to GPIOA 
RCC_AHB1EN R |= GPIOAEN; 


GPIOA MODE R |= Guset a jf 10a Sete bit 10 to al 
GPIOA MODE R &= ~(1U<<11); // 20: Set bit 11 to 0 


fj} Pile Grene Cie asliaicsinwlies Loog 
while (1) 


{ 


Register manipulation - from configuration to running your first firmware 


// Line 22: Set PA5 (LED PIN) high 
GPIOA OD R |= LED PIN; 
} // 23: End of infinite loop 


} // 24: End of main function 
Let’s break down each part of the code for a clearer understanding: 


e Line 17: This marks the beginning of the main function. This is the entry point of our program 
where the execution starts: 


// Line 17: Start of main function 
int main(void) 


{ 


e Line 18: This enables the clock for GPIOA. As we learned earlier, the RCC_AHB1EN_R register 
controls the clock to the AHB1 bus peripherals. The |= GPIOAEN operation sets the bit 
associated with GPIOA (GPIOAEN) in the RCC_AHB1EN _R register, ensuring that the GPIOA 
peripheral has the necessary clock enabled for its operations: 


// Line 18: Enable clock access to GPIOA 
RCC_AHB1EN R |= GPIOAEN; 


e Line 19-20: These lines configure PAS as an output pin. Setting bit 10 to 1 and bit 11 to 0 in 
GPIOA_MODE_R configures PAS in general-purpose output mode: 


GPIOA_MODE_R |= (1U<<10); // Line 19: Set bit 10 to 1 
GPIOA_MODE_R &= ~(1U<<11); // Line 20: Set bit 11 to 0 


e Line 21-23: These lines initiate an infinite loop, which is a common practice in embedded 
systems for continuous operation: 


// Line 21: Start of infinite loop 
while (1) 


// Line 22: Set PAS (LED_PIN) high 
GPIOA_OD R |= LED PIN; 
} // Line 23: End of infinite loop 


This is what is inside this loop: 


* Line 22: This line sets PA5 high. This is achieved by setting the respective bit in GPIOA_OD_R. 
The |= LED_PIN operation ensures that PA5 outputs a high signal (essentially turning 
the LED on). 


" Line 24: This marks the end of the main function: 


} // Line 24: End of main function 


53 


Constructing Peripheral Registers from Memory Addresses 


Now that we have a clear understanding of each line of the code, let’s proceed to enter the entire code 
into the main.c file. Additionally, for your convenience, this complete source code is available in 
the GitHub repository for the book. You can find it in the folder titled Chapter2. 


This is the entire code: 


// 1: Define base address for peripherals 
#define PERIPH BASE 0x40000000UL 
// 2: Offset for AHB1 peripheral bus 
#define AHB1PERIPH OFFSET 0x00020000UL) 
// 3: Base address for AHB1 peripherals 


#define AHB1PERIPH BASE PERIPH BASE + AHB1PERIPH OFFSET) 
// 4: Offset for GPIOA 

#define GPIOA OFFSET 0x0000UL) 

// 5: Base address for GPIOA 

#define GPIOA BASE AHB1PERIPH BASE + GPIOA OFFSET) 
Vi 6; Cmiesee score INCE 

#define RCC OFFSET 0x3800UL) 

// 7: Base address for RCC 

#define RCC_BASE AHB1PERIPH BASE + RCC_OFFSET) 
// 8: Offset for AHB1EN register 

#define AHB1EN R OFFSET 0x3 0UL) 


// 9: Address of AHB1EN register 


#define RCC_AHB1EN R (*(volatile unsigned int *) (RCC_BASE 
+ AHB1EN R OFFSET) ) 


// 10: Offset for mode register 
#define MODE_R_OFFSET Ox00UL) 
// 11: Address of GPIOA mode register 


#define GPIOA MODE R (*(volatile unsigned int *) (GPIOA_BASE + MODE R_ 
OFFSET) ) 


// 12: Offset for output data register 


#define OD R OFFSET (O0x14UL) 

// 13: Address of GPIOA output data register 

#define GPIOA OD R (* (volatile unsigned int *) (GPIOA BASE + OD R_ 
OFFSET) ) 

// 14: Bit mask for enabling GPIOA (bit 0) 

#define GPIOAEN (1U<<0) 

// 15: Bit mask for GPIOA pin 5 

#define PINS (eas) 

// 16: Alias for PIN5 representing LED pin 

#define EED PIN PIN5 


// 17: Start of main function 
int main (void) 


{ 


Register manipulation - from configuration to running your first firmware 


// 18: Enable clock access to GPIOA 
RCC_AHB1EN R |= GPIOAEN; 


GPIOA_MODE_R |= (1U<<10); // 19: Set bit 10 to 1 
GPIOA_MODE_R &= ~(1U<<11); // 20: Set bit 11 to 0 


ff Bile Sram Cle daetou LOE 
while (1) 


{ 
// 22: Set PAS(LED PIN) high 
GPIOA OD_ R |= LED PIN; 


} // 23: End of infinite loop 


} // 24: End of main function 


To build the project, first select it by clicking on the project name once, and then initiate the build 


process by clicking on the build icon, represented by a hammer symbol, in the IDE. 


ioe] tempWorkspace - 2_RegisterManipulation/Src/main.c - STM32CubelDE 
File Edit Source Refactor Navigate Search Project Run Window Help 


yi OOS -laixi@ia- a-e-e-i¢[0]9-:9 9- Bem 1: di~ A 


R Project Explo.. X = B [È mainc x 

BSy 3 1 // 

(55 1_FirstProject (in FirstProjec 
IDE 2_RegisterManipulation 


1: Define base address for peripherals 
2 #define PERIPH_BASE (0x40000009UL ) 
3 // 2: Offset for AHB1 peripheral bus 
4 #define AHB1PERIPH_OFFSET (0x00020000UL) 
5 // 3: Base address for AHB1 peripherals 
6 #define AHB1PERIPH_BASE 
7 // 4: Offset for GPIOA 
8 #define GPIOA_OFFSET (@x@@@8UL ) 

9 // 5: Base address for GPIOA 
10 #define GPIOA_BASE 
11 // 6: Offset for RCC 


(PERIPH_BASE + AHB1 


(AHB1PERIPH_BASE + 


12 #define RCC_OFFSET (@x38Q@UL ) 

13 // 7: Base address for RCC 

14 #define RCC_BASE (AHB1PERIPH_BASE + 
15 // 8: Offset for AHB1EN register 


16 


#define AHB1EN_R_OFFSET 


Figure 2.20: The build and run icons 


(@x3@UL ) 
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Once the build process is complete, the next step involves uploading the program to the microcontroller. 
To do this, we make sure the project is selected in the IDE, and then initiate the upload by clicking 
on the run icon, represented by a play symbol. 


The first time you run the program, the IDE may prompt you to edit or confirm the launch configurations. 
A dialog box titled Edit Configurations will appear. It’s sufficient to accept the default settings by 
clicking OK. Figure 2.21 shows the Edit Configuration dialog box. 


© Etait Configuration Oo x 


Edit launch configuration properties Q 


Name: | 2_RegisterManipulation | 
[D Main | $ Debugger) @ Startup t Source [Z] Common 


Project: 

f 2_RegisterManipulation ] Browse... 
C/C++ Application: 

[ Debug/2_RegisterManipulation.elf ] Search Project... Browse... 


Build (if required) before launching 


Build Configuration: Select Automatically v 
O Enable auto build O Disable auto build 
@ Use workspace settings Configure Workspace Settings... 
Revert Apply 


Figure 2.21: The Edit Configuration dialog box 


Following this confirmation, the IDE will commence the process of uploading your program to the 
microcontroller. When the upload is complete, you should see the green LED on the development 
board light up. This indicates that our bare-metal code is functioning as intended and successfully 
controlling the hardware. 


Summary 


We have now mastered configuring PA5 to control the LED, starting from the very basics of consulting 
the documentation for accurate addresses, effectively typecasting these addresses for register access, 
and creating aliases for specific bits within the registers. 


Summary 


In this chapter, we delved into the core of bare-metal programming, emphasizing the direct interaction 
with microcontroller registers. This gave us insight into some of the key registers of our microcontroller 
and the structure of those registers. 


We began by exploring various firmware development approaches, each offering a distinct level of 
abstraction. These approaches included the HAL, LL, Bare-Metal C programming, and the assembly 
language. This exploration helped us understand the trade-offs and applications of each approach in 
firmware development. 


We spent a significant part of the chapter defining addresses of some peripherals using the official 
documentation. This step was important in creating addresses for various registers within those 
peripherals. It involved gathering specific memory addresses, typecasting them for register access, 
and creating aliases for bits in the registers. 


The chapter culminated in a practical application where we configured PA5 to control the User LED 
on the development board. This exercise integrated the concepts discussed throughout the chapter, 
showcasing how theoretical knowledge can be applied in real-world firmware programming. 


In the upcoming chapter, we will delve into the build process. This exploration will allow us to 
understand the sequential stages that our source code undergoes to become an executable program 
capable of running on our microcontroller. 


57 


3 

Understanding the Build 
Process and Exploring 
the GNU Toolchain 


Bare-metal programming is a journey of deep understanding and precision, and in this chapter, we 
will navigate the complex realm of the embedded firmware build process. Our focus is the GNU Arm 
Toolchain, an important element in firmware development. Through a blend of theory and hands-on 
programming exercises, you will gain insights into how integrated development environments 
(IDEs) streamline the build process and how these processes can be manually replicated using the 
GNU Arm Toolchain. 


As the chapter progresses, we delve into the nuances of the compiler and its various options, tailored 
for Arm Cortex microcontrollers. The programming exercise in this chapter is designed to help you 
understand and effectively utilize the GNU tools, from compiling and linking to analyzing the depths 
of the output object files. 


In this chapter, we shall cover the following main topics: 
e The foundations - understanding the embedded build process 
e A tour of GNU binary tools for embedded systems 
e From IDE to command-line - watching the build process unfold 
By the end of the chapter, you will have an understanding of the embedded firmware build process 


and have developed the skills to fluidly switch between IDEs and command-line interfaces, enhancing 
your versatility as a firmware developer. 
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Technical requirements 

All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 

The foundations - understanding the embedded build 
process 


The journey from high-level source code in embedded firmware development to an executable binary 
image is intricate and multilayered. This process is commonly referred to as the firmware build process 
and involves several critical stages - pre-processing, compilation, assembly, linking, and locating. Each 
of these stages plays an important role in transforming human-readable code into machine-executable 
instructions. Figure 3.1 shows the entire build process and the tools involved at each stage of the process. 


Source files Preprocessed files Assembly files 
(Cassenbier T) B Relocatable | o () | Executable 


Object files 


Figure 3.1: The build process, detailing the input and output files for each stage and the specific tools used 


Let’s examine the stages. 


The pre-processing stage 


Pre-processing is the initial stage in the firmware build process. In this stage, the source code undergoes 
a series of transformations to prepare it for compilation. Typically, source files in embedded systems are 
written in C (.¢ files) and accompanied by header files ( . h files). The preprocessor is the specialized 
tool in the build process whose task is to handle these input files. 


During pre-processing, the preprocessor executes several key operations: 


e Stripping comments: Comments, which are crucial for human readability but irrelevant for 
machines, are removed from code. 


The foundations - understanding the embedded build process 


e Evaluating preprocessor directives: Lines starting with the # symbol, known as preprocessor 
directives, are processed. These directives often include macro definitions (#define), 
conditional compilation instructions (#ifdef, #ifndef, #endif), and file inclusion 
commands (#include). The preprocessor replaces these directives with their defined values 
or corresponding code segments. 


e Output generation: The output of this stage is a set of intermediate files, typically with the . i 
extension. These files represent the transformed source code, devoid of comments and with 
all directives evaluated. 


The next stage is the compilation stage. 


The compilation stage 


Compilation starts immediately after pre-processing. The compiler’s role is to take . i files and 
convert them into architecture-specific assembly code. This phase is where the high-level constructs 
of C are translated into the lower-level, more granular assembly instructions understood by the target 
processor architecture. 


This stage involves the following: 


e Input: The pre-processed . i files 


e Process: The compiler analyzes the code structure, optimizes it for performance and space, 
and translates it into assembly language 


e Output: The result of the compilation stage is a set of assembly code files, typically with the 
. s extension 


The next stage makes use of the . s files. 


The assembly stage 


Assembly is the stage where the . s files containing assembly code are converted into machine code, 
in the form of object files. This stage translates the human-readable assembly instructions into a 
binary format. 


This stage involves the following: 


e Input: Assembly code files (. s). 


e Process: The assembler interprets each assembly instruction and converts it into corresponding 
machine code. 


e Output: Object files, usually with the . o extension. These files contain binary code and are 
ready for the next stage of linking. 
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The linking stage 


Linking is the stage where all individual object files are combined to form a cohesive program. This 
stage also integrates any necessary standard library files and resolves references between different 
code modules. 


This stage involves the following: 


e Input: Object files ( . o) and C standard library files. 


e Process: The linker stitches together all object files, resolving symbolic references and addresses. 
It handles tasks such as memory allocation for variables and functions. 


¢ Output: The linker generates a relocatable file, which is comprehensive but not yet final 
executable code. 


A relocatable file is intermediate output in the firmware build process, which is created during the 
linking stage. It’s a comprehensive file that combines all individual object files (.o) and the necessary 
library. However, it is not yet a final executable. The concept of relocation plays an important role here. 
Relocation involves adjusting the symbolic addresses in the relocatable file to actual, specific memory 
locations. This process ensures that when the firmware runs on a target device, each part of the code 
and data is correctly placed in memory. The relocatable file contains all the necessary components 
of the firmware, but with addresses that are still “relative” - they need further adjustment during the 
locating stage to fit the unique memory map of the target microcontroller, leading to the creation of 
the final executable. 


The next stage is where we finally get our executable code. 


The locating stage 


This is the last stage; it involves converting the relocatable file into the final executable binary. This 
stage is guided by a linker script, which provides essential information about the memory layout of 
the target device. 


This stage involves the following: 


e Input: A relocatable file and a linker script. 


e Process: The locator uses the linker script to place code and data sections into their designated 
memory locations. It adjusts addresses and offsets to fit the targets memory map. 


e Output: An executable binary file, typically in formats such as Executable and Linkable Format 
(ELF) or a plain binary format. 


With this knowledge in hand, we are well-equipped to delve into the GNU Toolchain for Arm. Our 
goal is to effectively utilize specific tools within the toolchain to execute the various stages of the build 
process. This will be the focus of the next section. 


A tour of GNU binary tools for embedded systems 


A tour of GNU binary tools for embedded systems 


In this section, well delve into the GNU Bin tools, a suite of tools that come with the installation of the 
GNU Toolchain for Arm. These tools (commands) are essential for the various stages of the firmware 
build process, as well as additional tasks such as debugging. 


The first command we'll explore is arm-none-eabi-gcc. 


arm-none-eabi-gcc 
Let’s break down the components of the command: 


e arm: This specifies the target architecture. 


e none: This component indicates the operating system for which the code is being compiled. 
Here, none signifies that the code is meant for a bare-metal environment, meaning it will run 
directly on the hardware without an underlying operating system. 


e eabi: This stands for Embedded Application Binary Interface. FABI defines a standard for 
the binary layout of system and user programs, libraries, and so on. It ensures that the compiled 
code will work correctly on any Arm processor that adheres to the EABI standard. 


e gcc: This is short for GNU Compiler Collection. 


This single command compiles, assembles, and links our input code in one go. To use it, type 
arm-none-eabi-gcc in the command prompt or terminal, followed by the source file, then - 0, 
and the desired output filename. 


See Figure 3.2 for an example: 


command compiler flag 
| 
| \ | 


arm-none-eabi-gcc main.c —o main.o 


| | 


source file output file 


Figure 3.2: Usage of the arm-none-eabi-gcc command 


In this example, we specify the source file as main . c and the output file as main.o. 
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Some common compiler flags 


Since we have just introduced the -o compiler flag, let’s take this opportunity to introduce some of 
the other commonly used compiler flags. These compiler flags are used to modify command behavior 
as well as add options to commands: 


e -c: This flag is used to compile and assemble but not link. When added to the command, it 


processes the code up to the assembly stage but stops before linking. 


e -o file: As mentioned earlier, this specifies the name of the output file. 


e -g: Generates debugging information in the executable. 


e -Wall: Enables all warning messages, helping us identify potential issues in the code. 


e -Werror: Treats all warnings as errors, ensuring code quality and stability. 


e -I 


large projects. 


e -ansi and -std=STANDARD: These flags specify which standard version of the c language 


should be used. 


e -v: Provides verbose output from GCC, giving us detailed information about the 


compilation process. 


[DIR]: Includes a specified directory to search for header files; it’s useful for organizing 


Table 3.1 provides a summary of the flags and example usage for each flag. 


Flag Purpose Example usage 
-C Compile and assemble but dor't link arm-none-eabi-gcc -c 
source_file 
-o file | Link to the output file, named file arm-none-eabi-gcc source_ 
file -o output_file 
-g Generate debugging info in arm-none-eabi-gcc -g 
the executable source_file 
-Wall Enable all warning messages arm-none-eabi-gcc -Wall 
source_file 
-Werror Treat warnings as errors arm-none-eabi-gcc -Werror 
source file 
-I [DIR] | Include a directory for header files arm-none-eabi-gcc -I 
directory path source _ 
file 
-ansi Use the American National Standards | arm-none-eabi-gcc -ansi 
Institute (ANSI) standard source file 


A tour of GNU binary tools for embedded systems 


Flag Purpose Example usage 


-std Specify a standard version (e.g., C11) arm-none-eabi-gcc 
-std=c11 source file 


-v Verbose output from GCC arm-none-eabi-gcc -v 
source_file 


Table 3.1: Some compiler flags and their example usage 


Some architecture-specific flags 


In addition to the general compiler flags, there are several architecture-specific flags that enable 
precise configuration for various processor architectures. These flags are integral for tailoring the 
build process to specific ARM processors and their respective architectures. Let’s delve into some of 
the most frequently used architecture-specific flags: 


e -mcpu= [NAME]: Specifies the target ARM processor. Using this option configures the compiler 
to optimize the code for a specific processor. 


e -march= [NAME]: Specifies the target ARM architecture. It configures the compiler for a 
particular ARM architecture version. 


e -mtune= [NAME]: Similar to -mcpu, this specifies the target ARM processor for 
optimization purposes. 


e -thumb: Configures the compiler to generate code for the Thumb instruction set, which 
is a compressed version of the standard ARM instruction set, providing more code density 
and efficiency. 


e -marm: Instructs the compiler to generate code for the ARM instruction set. 


e -mlittle-endian/-mbig-endian: These options specify the endianness for the generated 
code. Little-endian is the most common format in ARM processors. 


Let’s see an example involving some of these flags: 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb main.c -o main.o 


4 


| J 
| 
processor 


compile and assemble, Instruction Set output file 
do not link. 
source file 


Figure 3.3: Usage of the arm-none-eabi-gcc command with some architecture-specific flags 
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In this example, we say the following: 


-c: Compile and assemble but do not link 
-mcpu=cortex-m4é: Build for the Cortex-M4 processor 
-mthumb: Use the Thumb instruction set 


-o main.o: Output the compiled file as main.o 


Table 3.2 provides a summary of the architecture-specific flags and example usage for each flag. 


Flag Purpose Example usage 
-mcpu= [NAME] Specify the target ARM processor -mcpu=cortex-m4 
-march= [NAME] Specify the target ARM architecture -march=armv7-m 
-mtune= [NAME] Optimize for a specific ARM processor -mtune=cortex-m4 
-thumb Generate code for the Thumb instruction set | -mthumb 

-marm Generate code for the ARM instruction set -marm 
-mlittle-endian | Generate code for little-endian mode -mlittle-endian 
-mbig-endian Generate code for big-endian mode -mbig-endian 


Table 3.2: Some architecture-specific compiler flags and their example usage 


Other commands in the GNU Toolchain for Arm 


Apart from the arm-none -eabi-gcec command, there are other important commands that we will 
frequently use when building with the GNU toolchain for Arm. Let’s examine some of these commands: 


arm-none-eabi-nm: The arm-none-eabi -nm command is a handy tool for listing the 
symbols from an object file. Symbols in this context refer to various identifiers in a program, 
such as function names, variable names, and constants. This tool is invaluable for examining 
the contents of compiled files, offering us insights into the structure and components of our 
program. This can be particularly useful for debugging purposes. 


arm-none-eabi-size: In embedded firmware development, where memory resources are 
often limited, understanding the memory footprint of different sections of our code is crucial. 
This tool provides valuable insights into how much memory the various parts of our code 
consume, allowing us to make informed decisions about optimization and memory management. 


arm-none-eabi-objdump: This tool is used to extract and display detailed information 
from object files. It offers an in-depth view of the machine instructions, making it an invaluable 
resource for thorough analysis of object files. This includes capabilities such as disassembling 
code, presenting section headers, and revealing symbol tables. Its utility becomes crucial when 
we need to delve into the intricate details of compiled code, providing clarity on a file’s structure, 
content, and operational mechanics. This helps us to both debug and optimize our code. 


A tour of GNU binary tools for embedded systems 


Disassembling code refers to the process of converting machine code, which is a set of binary 
instructions that a computer’s processor can execute directly, back into assembly language. 
Assembly language is a more human-readable form of instructions, although it’s still quite 
low-level compared to the C language. Figure 3.4 presents a comparison of C-language code, 
its corresponding assembly language translation, and the resulting machine code. 


// 18: Enable clock access to GPIOA 
RCC_AHB1EN_R |= GPIOAEN; 


4b@c 
681b 


4a@b 
F043 9301 
6013 


ldr r3, 
ldr 
ldr 
orr. 
str 


[pc, #48] 3 (38 <main+0x38>) 
[r3, #0] 

[pc, #44] 3 (38 <main+0x38>) 
r3, #1 

[r2, #0] 


Figure 3.4: C-language code, its corresponding assembly language code, and the resulting machine code 


e arm-none-eabi-readelf: This tool provides detailed information about the output 
ELF file, including section headers, program headers, and symbol tables. It is useful when we 
work with ELF files, as it offers insights into how an executable is structured and prepared to 


run on a system. 


e : arm-none-eabi-objcopy: We use this tool to convert object files from one format to 
another or to make a copy of an object file. 


Table 3.3 provides a summary of these additional tools and example usage for each tool. 


readelf 


Tool Function Example Usage 

arm-none-eabi-nm Lists symbols from arm-none-eabi-nm [object 
object files file] 

arm-none-eabi-size Lists section sizes of arm-none-eabi-size [file] 
object/executable files 

arm-none-eabi - Dumps information arm-none-eabi-obj dump 

obj dump about object files options] [object file] 

arm-none-eabi- Displays information arm-none-eabi-readelf 


about ELF files 


options] [ELF file] 


arm-none-eabi- 
obj copy 


Converts/copies object | arm-none-eabi-obj copy 
files between formats options] [input file] 


output file] 


Table 3.3: Some common commands in the GNU toolchain for Arm and their example usage 
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In this section, we explored the GNU Binary Tools that are essential for embedded firmware development. 
These tools, including commands such as arm-none-eabi-gcc, play important roles in various 
stages of the firmware build process and are invaluable for tasks such as compiling, linking, and 
debugging. The next section will further expand our understanding of these tools; we'll dive into their 
practical applications, demonstrating their utility in the embedded firmware build process. 


From IDE to the command line - watching the build 
process unfold 


In this section, our aim is to understand how the compiler within the IDE handles our code when we 
initiate a build. Additionally, we will delve into the practical applications of some of the GNU Binary 
Tools we discussed in the previous section. 


Observing the build process from the IDE’s perspective 


Let’s start by revisiting the bare-metal GPIO driver we developed in the previous chapter. 


Begin by launching your STM32CubeIDE. To conduct a proper analysis of the build commands that 
the IDE executes, it’s necessary to first clean the project and then build again. The reason for this is 
straightforward - we've already built the project, in a previous chapter. Without any modifications to 
the source code since then, a new build attempt would skip the detailed command execution we aim 
to scrutinize, since the source code hasn't changed. 


Let’s clean the project: 


1. Locate the project in the Projects pane. 

2. Right-click on the name of the project. 

3. A menu will appear. From this menu, select the Clean Project option. 

4. The IDE will now clear any already compiled data from the project. This action resets the build 
state of the project to its initial condition, erasing any previous build results. 


Now, let’s build it again: 


1. Right-click once more on the same projects name in the Projects pane. 
2. Again, a menu will appear. This time, select the Build Project option. 


3. By selecting this option, the IDE will start the process of building the project from scratch. 


To observe the build commands in the STM32CubeIDE, we must find the Console pane, which 
is usually positioned at the bottom area of the IDE’s interface. The Console pane is an important 
component that displays real-time outputs and logs for various actions, including the build process. 
This makes it an invaluable tool to monitor the commands and actions undertaken during the 
compilation of our projects. 
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If the Console pane is not readily visible in your current IDE layout, you can access it through the 
menu bar located at the top of the IDE. Simply click on the Window menu to reveal a drop-down 
list of options. From there, navigate to Show View, which will expand to show more choices. Among 
these, select Console to bring the pane into view within your workspace. 


Once you have the Console pane visible, you can monitor the execution of the build commands. Whenever 
we initiate the build process, each command used during the build - encompassing compilation, linking, 
and other stages - will be displayed in this pane. Figure 3.5 shows some of the content of Console 
pane after building our 2. RegisterManipulation bare-metal GPIO driver project. Figure 3.5 
shows some of the content of the Console pane after building our 2_RegisterManipulation 
bare-metal GPIO driver project: 


[E Problems $) Tasks Œ Console x [Z] Properties 
CDT Build Console [2_RegisterManipulation] 


21:38:13 **** Build of configuration Debug for project 2_RegisterManipulation **** 

ymake -j8 all 
2arm-none-eabi-gec -mcpu=cortex-m4 -g3 -DDEBUG -c -x assembler-with-cpp -MMD -MP -MF"Start 
3arm-none-eabi-gcec "../Src/main.c" -mcpu=cortex-m4 -std=gnu11 -g3 -DDEBUG -DNUCLEO_F411RE 
4arm-none-eabi-gcec "../Src/syscalls.c" -mcpu=cortex-m4 -std=gnul1 -g3 -DDEBUG -DNUCLEO_F41 
Sarm-none-eabi-gcec "../Src/sysmem.c" -mcpu=cortex-m4 -std=gnu11 -g3 -DDEBUG -DNUCLEO_F411R 
6arm-none-eabi-gcc -o "2_RegisterManipulation.elf" @"objects.list" -mcpu=cortex-m4 -T"C: 
7Finished building target: 2_RegisterManipulation.elf 
8arm-none-eabi-size 2_RegisterManipulation.elf 
Qarm-none-eabi-objdump -h -S 2_RegisterManipulation.elf > "2_RegisterManipulation.list" 

text data bss dec hex filename 
720 8 1568 2296 8f8 2_RegisterManipulation.elf 
Finished building: default.size.stdout 


Finished building: 2_RegisterManipulation.list 


< 


Figure 3.5: The Console pane after building the 2_RegisterManipulation bare-metal GPIO driver project. 


Figure 3.5 provides a snapshot of the build process steps, although it only displays a segment of each 
step. For a comprehensive view of all the steps, including the full lines of commands and responses, 
you should refer to the Console pane in your STM32CubeIDE. 


Let's analyze the console pane in Figure 3.5, according to the line numbering.. 


Compilation of assembly and C files 
Line (1) 


make -j8 all: This isa command to the make build automation tool, requesting it to execute 
the build. We shall learn about make in the upcoming chapters. 


Lines (2)(3)(4)(5) 
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The following lines are specific arm-none-eabi-gcc commands to compile individual source 
files such as main.c, syscalls.c, and sysmem. c. These commands specify the target CPU 
(-mcpu=cortex-m4) and other compiler flags. Each source file is compiled into an object file ( . o). 


Linking process 
Line (6) 


The arm-none-eabi-gcc command with a list of object files (@"objects.list") anda 
linker script (STM32F411RETX_FLASH. 14) links these object files into an executable file (2_ 
RegisterManipulation.elf). The linker script guides how different sections of the code and 
data are placed in the final executable. We shall discuss linker scripts in the next chapter. 


Size calculation: 
Line (8) 


arm-none-eabi-size 2 RegisterManipulation.elf: This command calculates 
the size of the compiled program, breaking it down into text (code), data (initialized data), and bss 
(uninitialized data) sections. The output shows the size of these sections in bytes and their total in 
both decimal (dec) and hexadecimal (hex) formats. 


Creation of a list file: 
Line (9) 


arm-none-eabi-objdump -h -S 2 RegisterManipulation.elf > "2_ 
RegisterManipulation.1list": This command disassembles the executable and outputs a 
detailed list file. The -h flag shows the header information, and -S intersperses source code with 
disassembly. A list file is a detailed textual representation of compiled code, containing both the 
assembly language instructions and their corresponding machine code, often with annotations of 
the original high-level source code. 


As we can observe from these logs, it is clear that our STM32CubeIDE employs the same GNU Binary 
Tools previously discussed. In our forthcoming section, we will manually execute these commands 
via the command line. This approach will teach us how to build our firmware without using an IDE, 
simply using the source text files, the command-line interface, and our suite of GNU Bin Tools. 


Working with the GNU bin tools 


In this section, our focus is to execute some of the GNU Bin Tools directly, using our command line. 
To start I want to show you why I use the words commands and tools interchangeably to describe the 
GNU Bin Tools. 


From IDE to the command line - watching the build process unfold 


Locate the installation folder for the GNU Arm Embedded Toolchain on your computer. On my 
computer, this is C:/Program Files (x86) /GNU Arm Embedded Toolchain. 


Once you've located the GNU Arm Embedded Toolchain folder, the next step involves accessing the 
bin folder within it. 


Upon opening the bin folder, you'll be greeted with a plethora of tools, each represented by an 
executable file (. exe). Looking closely, you will find arm-none-eabi-gcc . exe, our compiler, 
along with other tools we previously discussed. When we input a command corresponding to these 
tools in the command line, the associated . exe file is executed. For instance, entering arm-none - 
eabi-gcc in the command line will run the arm-none-eabi-gcc. exe executable. 


Now that we have clarified that, it’s time to shift our focus toward practical testing. However, before 
diving into this testing phase, a few essential preparatory steps are required. Let’s create a backup of 
our current project, 2_RegisterManipulation: 


1. Locate your project: Navigate to your workspace or the folder where the 2 __ 
RegisterManipulation project is stored. 


2. Copy the project: Right-click on the 2. RegisterManipulation project folder, select 
Copy, and then Paste within the same directory. 


3. Rename the Backup: Rename the newly pasted folder to 2_ RegisterManipulation-old. 
With the backup in place, our next move is to modify the main.c file in the 2. Register 
Manipulation project, changing the LED’s behavior from a constant on state to a blinking one: 

1. Access the source File: Go to the 2, RegisterManipulation/Src directory. 


2. Edit the main.c file: Right-click on the main.c file and choose to open it in a simple text 
editor, such as Notepad. 


3. Update the code: Find the section of code that sets PAS (LED PIN) high. It should look 
like this: 


// 22: Set PA5(LED PIN) high 
GPIOA OD R |= LED PIN; 


Replace this code with the following to toggle the state of PAS and create a blinking effect: 


// 22: Toggle PAS (LED PIN) 
GPIOA OD R “= LED PIN; 


for(int i = 0; i < 100000; i++){} // Delay loop for visible 
blinking 


4. Save the changes: After updating the code, save the main. c file. 
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Let’s go back to our project folder and access the 2. RegisterManipulation/Debug directory 
through the command prompt. This specific folder is important because it’s where STM32CubeIDE 
automatically places the project’s makefile. Understanding the role and structure of makefiles 
is crucial in embedded firmware development, and we will delve into this topic in more detail in 
upcoming chapters. 


We can access the folder through the command prompt in multiple ways: 


Windows users can choose between these methods: 


e Method 1 - using the context menu: Navigate to the 2. RegisterManipulation/Debug 
folder in Windows Explorer. Once there, hold down the Shift key, right-click in an empty space 
within the folder, and select Open Command Window Here. This action will open a Command 
Prompt window directly in the Debug folder. 


e Method 2 - copying the path: Alternatively, click on the address bar in Windows Explorer 
while in the 2_ RegisterManipulation/Debug folder and then copy the folder path. 
Then, open Command Prompt from the Start menu or by typing cmd in the Run dialog (Win 
+ R). In the Command Prompt, type cd (note the space after ‘cd’), paste the copied path, and 
press Enter. This will change the directory to the Debug folder. 


Users of other operating systems can choose between these methods: 


e Navigate to the folder, and open the terminal specific to your operating system. 


e Use the cd (change directory) command, followed by the absolute path to the 2_ 
RegisterManipulation/Debug folder to navigate to it. The exact path may vary, based 
on where the project is located on your system. 


| B Command Prompt 


Microsoft Windows [Version 10.0.19045.3803] 
(c) Microsoft Corporation. All rights reserved. 


C:\Users\Ninsaw>cd C:\Users\Ninsaw\Documents\tempWorkspace\2_RegisterManipulation\Debug 


C: \Users\Ninsaw\Documents\tempWorkspace\2_RegisterManipulation\Debug> 


Figure 3.6: Accessing the Debug folder through the Windows Command Prompt 


In this practical exercise, we'll replicate the commands used by STM32CubeIDE, extracting them 
directly from its console pane. We'll execute them one by one, as depicted in Figure 3.5, starting with 
line number 2 (since our current focus isn’t on makefiles). 


From IDE to the command line - watching the build process unfold 


To do this, follow these steps: 


1. 


Copy line number 2 from the STM32CubeIDE console pane and paste it into the command 
prompt. This line compiles the startup file using arm-none-eabi-gcc, referencing paths 
specific to my system setup: 


C: \Users\Ninsaw\Documents\tempWorkspace\2 RegisterManipulation\ 
Debug>arm-none-eabi-gcc -mcpu=cortex-m4 -g3 -DDEBUG -c -x 
assembler-with-cpp -MMD -MP -MF"Startup/startup stm32f41llretx.d" 
-MT"Startup/startup_ stm32f4llretx.o" --specs=nano.specs 
-mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -o "Startup/startup_ 
stm32f411lretx.o" "../Startup/startup_stm32f411lretx.s" 


Successful execution will not result in any error messages. 


Proceed with line number 3, which compiles the main. c file. However, before pasting this 
command into the command prompt, remove the -f£cyclomatic-complexity flag, as 
it is not supported by some versions of the GNU Toolchain for Arm. Paste the command into 
a text editor, delete the flag, and then copy the modified command into the command prompt. 
The command for line number 3 should look like this: 


C: \Users\Ninsaw\Documents\tempWorkspace\2 RegisterManipulation\ 
Debug>arm-none-eabi-gcc "../Src/main.c" -mcpu=cortex-m4 
-std=gnull -g3 -DDEBUG -DNUCLEO F411RE -DSTM32 -DSTM32F4 
-DSTM32F411RETx -c -I../Inc -00 -ffunction-sections -fdata- 
sections -Wall -fstack-usage -MMD -MP -MF"Src/main.d" -MT"Src/ 
main.o" --specs=nano.specs -mfpu=fpv4-sp-d16 -mfloat-abi=hard 
-mthumb -o "Src/main.o" 


Follow the same procedure for lines 4 and 5 to compile the syscalls.cand system.c 
files, respectively. 


We will proceed to link these files. To do this, copy the linking command, which is on line 
number 6 in the STM32CubeIDE console pane, and paste it into the command prompt, like this: 


c:\Users\Ninsaw\Documents\tempWorkspace\2 RegisterManipulation\ 
Debug>arm-none-eabi-gcc -o "2 RegisterManipulation.elf" 
@"objects.list" -mcpu=cortex-m4 -T"C:\Users\Ninsaw\Documents\ 
tempWorkspace\2 RegisterManipulation\STM32F411RETX FLASH. 

ld" --specs=nosys.specs -Wl,-Map="2 RegisterManipulation.map" 
-W1l,--gc-sections -static --specs=nano.specs -mfpu=fpv4-sp-d16 
-mfloat-abi=hard -mthumb -Wl,--start-group -lc -lm -W1,--end- 
group 


A successful linking process will not result in any error messages. 


We can use the arm-none-eabi-size command to display the size of our output .elf 
file. This will give us insights into the sizes of various sections, such as text, data, and bss. We 
will discuss these sections in detail later in the book, particularly when we delve into writing 
linker scripts. 
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To execute this, copy line number 8 from the STM32CubeIDE console pane and paste it into the 
command prompt. Running the following command: 


c:\Users\Ninsaw\Documents\tempWorkspace\2 RegisterManipulation\ 
Debug>arm-none-eabi-size 2 RegisterManipulation.elf 


It will display the size details of the .e1f file: 


text data bss dec hex filename 


744 8 1568 2320 9190 2_RegisterManipulation.elf 


Figure 3.7: Output produced by executing the arm-none-eabi-size command 


Observing the results, we can confirm that the output matches exactly what is displayed in the 
STM32CubeIDE console pane. 


At this stage, we can choose to convert our .e1f file into the . bin format using the arm-none- 
eabi-objcopy tool, with the appropriate flags. 


Type the following in the command prompt and press Enter: 


arm-none-eabi-objcopy -O binary 2 RegisterManipulation.elf 2_ 
RegisterManipulation.bin 


Let’s break down this snippet: 


e -O binary specifies the output format, which in this case is a binary file 
e 2 RegisterManipulation. elf is the source ELF file you are converting 


e 2 RegisterManipulation.binis the name of the output binary file that will be created 


This final step marks the completion of our first build process. We have successfully compiled and 
linked all necessary files, resulting in the creation of our final executable in two formats. The next 
process involves uploading the firmware to our microcontroller using OpenOCD. This will be covered 
in the next section. 


Uploading firmware to the microcontroller using OpenOCD 


Open On-Chip Debugger (OpenOCD) plays an important role in embedded firmware programming. 
It not only facilitates the transfer of executable firmware code to a microcontroller but also provides 
robust debugging capabilities. In this section, we shall go through a step-by-step process to upload 
the 2. RegisterManipulation executable file into our microcontroller. 


From IDE to the command line - watching the build process unfold 


We will start by locating the correct OpenOCD script for our development board. OpenOCD comes 
with a variety of scripts, each tailored to different microcontrollers and development boards. In our 
case, the focus is on the st_nucleo_f4 series. To find the right script, follow these steps: 


1. Navigate to the OpenOCD installation directory, typically found in the Program Files 
folder for Windows users. The OpenOCD folder is usually named xpack - openocd. Once 
there, enter the openocd subfolder, then the scripts subfolder, and finally, the board 
subfolder. You will find a file named st_nucleo_f4.cfg; this is the OpenOCD file we have 
to execute for our NUCLEO-F4 development board. 


2. To launch OpenOCD, connect your development board, open the command prompt window, 
and enter the following command: 


openocd -f board/st nucleo f4.cfg 


This command starts OpenOCD with the configuration file specific to the STM32 NUCLEO F4 
development board. The configuration file, st_nucleo_f£4.cfg, contains all the necessary 
settings for OpenOCD to communicate with the development board. 


This is a snippet of the output from the command prompt after executing the command: 


Select Command Prompt - openocd -f board/st_nucleo_f4.cfg 


C: \Users\Ninsaw\Documents\tempWorkspace\2_RegisterManipulation\Debug>openocd -f boa 
xPack Open On-Chip Debugger 9.12.0+dev-01312-g18281b@c4-dirty (2023-@9-04-22:32) 
Licensed under GNU GPL v2 
For bug reports, read 
http: //openocd.org/doc/doxygen/bugs .html 
Info : The selected transport took over low-level target control. The results might 
srst only separate srst_nogate srst_open_drain connect_deassert_srst 
Listening on port 6666 for tcl connections 
Listening on port 4444 for telnet connections 
: clock speed 2000 kHz 
: STLINK V2341M27 (API v2) VID:PID 0483:374B 
: Target voltage: 3.267716 
[stm32f4x.cpu] Cortex-M4 r@p1 processor detected 
[stm32f4x.cpu] target has 6 breakpoints, 4 watchpoints 
starting gdb server for stm32f4x.cpu on 3333 
Listening on port 3333 for gdb connections 


Figure 3.8: OpenOCD'’s first output 


The information presented here includes the following: 


e Communication ports: OpenOCD reports that it is listening on port 6666 for Tcl connections 
and port 4444 for Telnet connections. These ports are used to send commands to OpenOCD 
and interact with it during debugging sessions. 
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e Processor and debug capabilities: The debugger has identified the Cortex-M4 r0p1 processor in 
the STM32F4 series microcontroller. Additionally, it notes that the target has six breakpoints and 
four watchpoints, which are crucial for setting breakpoints and watchpoints during debugging. 


¢ GDB server initiation: The last two lines in the output indicate that a GDB server has been 
started for the STM32F4 series CPU on port 3333. This server allows a GDB (GNU Debugger) 
client to connect for debugging purposes. 


With OpenOCD running, the next step involves using the GDB to upload the firmware to the 
microcontroller. Let’s access another command prompt window, still from the Debug folder (as 
OpenOCD should keep running in the first one), and enter the following command to start the GDB: 


arm-none-eabi-gdb 


Once the GDB is open, we establish a connection to the microcontroller by running the following: 


target remote localhost: 3333 


This command connects GDB to the OpenOCD server running on the local machine (localhost) 
on port 3333, which is the default port for OpenOCD. 


Upon executing this command, both command prompt windows return outputs telling us that 
debugging has started: 


BF Command Prompt - arm-none-eabi-gdb 


Copyright (C) 2021 Free Software Foundation, Inc. 
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gp1.html> 
This is free software: you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law. 
Type “show copying" and “show warranty" for details. 
This GDB was configured as “--host=i686-w64-mingw32 --target=arm-none-eabi". 
Type “show configuration" for configuration details. 
For bug reporting instructions, please see: 
<https: //www.gnu.org/software/gdb/bugs/>. 
Find the GDB manual and other documentation resources online at: 
<http: //www. gnu.org/software/gdb/documentation/>. 


For help, type “help”. 

Type “apropos word" to search for commands related to "word". 

(gdb) target remote localhost :3333 

Remote debugging using localhost: 3333 

warning: No executable has been specified and target does not support 
determining executable automatically. Try using the "file" command. 


in Q 


(gdb) 


Figure 3.9: Output from the command prompt window running the GDB 


From IDE to the command line - watching the build process unfold 


This is the output from the command prompt window running st_nucleo_f4.cfg: 


E Command Prompt - openocd -f board/st_nucleo_f4.cfg 


http: //openocd.org/doc/doxygen/bugs . html 
Info : The selected transport took over low-level target control. The result 
srst only separate srst_nogate srst_open_drain connect_deassert_srst 
Listening on port 6666 for tcl connections 
Listening on port 4444 for telnet connections 
clock speed 2000 kHz 
STLINK V2341M27 (API v2) VID:PID 0483:374B 
Target voltage: 3.264567 
[stm32f4x.cpu] Cortex-M4 r@p1 processor detected 
[stm32#4x.cpu] target has 6 breakpoints, 4 watchpoints 
starting gdb server for stm32f4x.cpu on 3333 
Listening on port 3333 for gdb connections 
accepting 'gdb' connection on tcp/3333 
eae cpu] halted due to debug-request, current mode: Thread 
xPSR: 0x01909090909090 pc: Oxe8Eeee@6bc msp: Ox2Ee01ffce 
Info : device id = 0x10006431 
: flash size = 512 KiB 
: flash size = 512 bytes 
Prefer GDB command “target extended-remote :3333" instead of “target 


Figure 3.10: Output from the command prompt window running st_nucleo_f4.cfg 


Before loading the firmware, we have to reset and initialize the board using the following command: 
monitor reset init 
Let’s break down the command into its components: 


e monitor: This prefix is used in the GDB to indicate that the following command is not a GDB 
command but is meant for the debugging server (in this case, OpenOCD) 


e reset: This part of the command instructs OpenOCD to reset the target device 
e init: This tells OpenOCD to execute its initialization sequence for the target device 
Then, we load the firmware onto the microcontroller using the following command: 


monitor flash write image erase 2 RegisterManipulation.elf 
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This command erases the existing firmware on the microcontroller and writes the new firmware (in 
this case, 2. RegisterManipulation.el1f) onto it. This command erases the existing firmware 
on the microcontroller and writes the new firmware (in this case, 2. RegisterManipulation. 
elf) onto it: 


e flash write_image: This isan OpenOCD command that tells it to write an image to 
the flash memory of the target microcontroller - in effect, programming the microcontroller 
with a new firmware image. 


e erase: This option tells OpenOCD to erase the flash memory before writing the new image. 
Erasing the flash is a common requirement in microcontroller programming, as it clears any 
previous program and ensures that the new firmware is written to a clean memory space. 


After successfully loading the firmware, we reset the board again with the same reset command. 


monitor reset init 


Then, we resume the execution of the code on the microcontroller with the following: 


monitor resume 


Voila! The firmware should now be running on the microcontroller. You should see the LED on the 
board blinking, indicating the successful upload and execution of the new firmware. 


Summary 


In this chapter, we explored the intricacies of the embedded firmware build process, with a specific 
focus on the GNU Toolchain. 


We began by getting to know the embedded build process, exploring its multiple stages - pre-processing, 
compilation, assembly, linking, and locating. Each stage was analyzed, clarifying its significance in 
transforming human-readable source code into executable machine instructions. We delved into the 
roles of pre-processing in preparing code, the nuances of compilation and assembly in translating and 
converting code, and the intricate tasks of linking and locating in forming a cohesive, executable binary. 


Transitioning to practical application, the chapter introduced the GNU Binary Tools for Embedded 
Systems. By revisiting our previously developed bare-metal GPIO driver, we observed the build 
commands executed by the STM32CubelIDE, replicating these steps manually using our command- 
line interface. This approach gave us a deeper appreciation of the underlying processes and commands 
that IDEs automate. In the latter part of the chapter, we went through the step-by-step process of 
uploading our firmware to the microcontroller using OpenOCD, from locating the correct OpenOCD 
script for the development board to executing commands to reset, initialize, and run the firmware on 
the microcontroller. This practical exercise demonstrated the successful application of the theoretical 
knowledge we gained earlier, marking a significant milestone in our journey. 


In the next chapter, we shall learn how to write our own linker scripts and startup files. This important 
step will represent another significant milestone in our journey toward mastering the art of developing 
entirely bare-metal firmware from the ground up. 


4 


Developing the Linker Script 
and Startup File 


In this chapter, we undertake an in-depth exploration of the core components of embedded bare-metal 
programming, focusing on three critical areas: the microcontroller memory model, the writing of 
the linker script, and the startup file. 


First, we'll explore the microcontroller memory model to understand how memory is organized and 
utilized. This knowledge is important for accurately allocating program code and data sections within 
the microcontroller memory. Next, we'll go through the intricacies of writing linker scripts. These scripts 
are essential for correctly mapping our program to the appropriate sections of the microcontroller’s 
memory, ensuring that the executable runs as intended. 


Finally, we will learn about the startup file and then proceed to write our own, focusing on initializing 
the vector table and configuring Reset_Handler. 


In this chapter, were going to cover the following main topics: 
e Understanding the memory model 
e ‘The linker scripts 


e Writing the linker script and startup file 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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Understanding the STM32 memory model 


While the STM32 memory map consists of various memory areas, our primary focus in developing 
the linker script and startup file revolves around two critical areas: flash memory and static random 
access memory (SRAM). These areas are of utmost importance because they are directly involved 
in program storage. In the initial parts of this section, we will learn about the characteristics of these 
memory areas and the distinct roles they play. 


Figure 4.1 shows a section of the stm32f411 memory map, highlighting the flash memory and SRAM. 


0x2002 0001 - Ox3FFF FFFF 
SRAM (128 KB aliased | 0x2000 0000 - 0x2002 0000 

Reserved Ox1FFF C008 - 0x1FFF FFFF 

Option bytes 0x1FFF C000 - 0x1FFF C007 


0x1FFF 7A10 - Ox1FFF BFFF 
Ox1FFF 0000 - 0x1FFF 7A0F 


| Reserved ~=—S— ($0808 0000 - 0x1 FFE FFFF 
Flash memory __| 0x0&00 0000 - 0x0807 FFFF 


0x0008,0000 - 0x07FF FFFF 


Aliased to Flash, 
system, memory or 
SRAM depending, on 
the BOOT pins —_|0x0000 0000 - 0x000 


Figure 4.1: A section of the STM32F11 memory map, highlighting the flash memory and SRAM areas 
Let’s start with flash memory. 


Flash memory 


One of the primary advantages of flash memory is its non-volatile nature. This means that data stored in 
flash memory remains intact even when the power supply is disconnected. In STM32 microcontrollers 
(as well as other microcontrollers), flash memory is typically where the executable code is stored and 
is read-only during normal operation. Flash memory starts at the 0x08000000 address. However, 
its size varies depending on the specific STM32 microcontroller model. 


STM32 microcontrollers come in various series and models, offering a range of flash memory densities 
to accommodate different application requirements. 


Understanding the STM32 memory model 


What is memory density? 


Memory density refers to the concentration of memory storage within a given physical space 
or component. Memory density is often expressed in terms of bits or bytes stored per unit of 
physical area, such as bits per square millimeter or bytes per square centimeter. It measures 
how densely or compactly data can be stored within that space. Higher memory density means 
that we can store more data than we can in a smaller physical space. 


Memory size, on the other hand, refers to the total amount of memory (storage capacity) 
available in a given storage device. It is typically measured in units such as bytes, kilobytes 
(KB), megabytes (MB), gigabytes (GB), etc. 


Ns a 


Now, let’s talk about some operational nuances of flash memory. 


Flash memory, including STM32 flash memory, has a limited number of program and erase cycles. 
Each time we write (program) or erase, it consumes one of these cycles. It is important to consider 
these limitations when designing applications that frequently write to or erase data from flash memory. 


STM32 microcontrollers are known for their low power consumption, and this extends to their flash 
memory operations. Efficient power management ensures that the microcontroller can operate on 
minimal power while reading from or writing to flash memory, making STM32 devices suitable for 
battery-powered applications. 


To ensure data integrity, STM32 flash memory often includes built-in error correction mechanisms. 
These mechanisms help identify and correct errors that may occur during data storage and retrieval, 
enhancing the reliability of the stored firmware. 


The following are some key attributes of the STM32 flash memory: 


e Read-only nature: Primarily used for storing program code 
e Memory start address: 0x08000000 
e Variable size: Dependent on the specific STM32 microcontroller model 


e Vector table location: 0x08000004 


Let's take a look at the other primary memory areas relevant to writing our linker script and startup file. 


SRAM 


SRAM is a type of volatile memory, meaning it loses its contents when the power supply is disconnected. 
It is used in STM32 microcontrollers for temporary data storage during program execution. Unlike 
flash memory, which is used for long-term storage of program code, SRAM is designed for high- 
speed access and low latency, making it ideal for storing variables, intermediate data, and managing 
the stack during runtime. 
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Like flash memory, the STM32 microcontrollers feature varying sizes of SRAM, tailored to the needs 
of different applications. The size of the SRAM determines the amount of runtime data that can be 
handled and affects the overall performance of the microcontroller in handling complex tasks or 
multitasking. The SRAM in STM32 microcontrollers starts at the 0x20000000 address. Like flash 
memory, its size varies depending on the specific STM32 microcontroller model. 


The following are the key attributes of the STM32 SRAM: 


e Read and write: Variables and the stack are stored here 
e Memory start address: 0x20000000 


e Variable size: Dependent on the specific STM32 microcontroller model 


Before moving on to introduce the linker script, let’s touch on one other memory area that is not 
relevant to the linker script, but is still relevant to our understanding of the memory layout of 
our microcontroller. 


Peripheral memory 


Peripheral memory is dedicated to managing and interfacing with the microcontrollers onboard 
peripherals. These peripherals include components such as timers, communication interfaces (UART, 
SPI, 12C), and analog-to-digital converters (ADCs). Peripheral memory is made up of registers that 
are used to configure and manage these peripherals. 


An important aspect of microcontroller architecture is the use of memory-mapped input/output 
(I/O). Memory-mapped I/O is a technique where peripheral registers are assigned specific addresses 
in the system’s memory space. This approach allows firmware to interact with hardware peripherals by 
reading from or writing to these memory addresses, just as it would with regular memory. The peripheral 
memory area of the STM32 memory map is the memory-mapped area for the peripheral registers. 


Now that we are familiar with the major memory areas, we are ready to learn about the linker script. 


The linker script 


Linker scripts play an important role in the build process, especially in defining the memory layout and 
allocating various memory sections used by the firmware. They specify where different sections of the 
firmware, such as code, data, and uninitialized data, are to be placed in the microcontroller’s memory. 


While linker scripts set up the structure and boundaries for these sections, it is important to note 
that they do not populate these sections with data. The actual process of initializing data with specific 
values is handled by the startup code, which runs when the microcontroller boots up. We provide these 
linker scripts to the linker to effectively guide the organization of memory during the linking phase. 


The linker script 


sen | Cr] M (ene) 


` fe Preprocessed files Asse! r files 
Source files Preprocessed files Assembly files 


5 Relocatable 
(Caesemier | B Cimer |) atabe | (Croon) | pecutabie 


Object files 


Figure 4.2: The build process with the linker highlighted 


Understanding the linking process 


In the build process, the linking of object files is an important step that transforms individual pieces of 
code into functional firmware. The assembler generates object files from source code, each containing 
code and data sections necessary for the firmware. However, these object files often have unresolved 
internal references to variables and functions, making them incomplete on their own. For instance, an 
object file may contain a reference to an adc_value variable that is defined elsewhere. It is the linker’s 
job to amalgamate these object files, systematically resolving all such unresolved symbols to create a 
cohesive output file. To fully appreciate the meticulous work of the linker, we have to understand the 
attributes assigned to each section by the linker. 


Section attributes and their implications 


Each section within an object file is identified by a unique name and size, with specific attributes that 
dictate how they should be treated: 


e Loadable sections: These sections contain content that must be loaded into memory at 
runtime. They are essential for the execution of the program and include executable code and 
initialized data. 


e Allocatable sections: These sections do not carry content by themselves. Instead, they signal 
that a certain area of memory should be reserved, typically for uninitialized data that will be 
defined at runtime. 


e Non-loadable, non-allocatable sections: Often, a section that is neither loadable nor allocatable 
contains debugging information or metadata that helps in the development process but is not 
required for the program’s execution. 
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A crucial aspect of the linking process is the determination of two types of addresses for each allocatable 
and loadable output section: the virtual memory address (VMA) and the load memory address (LMA). 


These are the roles of these two addresses: 


e VMA: This address represents where the section will reside in memory during the execution 
of the output file. It is the runtime address used by the system to access the section’s data 
or instructions. 


e LMA: Conversely, the LMA is the address where the section is physically loaded into memory. 


In most scenarios, the VMA and LMA are identical 


A notable exception occurs when a data section is initially loaded into flash memory but then 
copied to SRAM upon startup. 


To provide a clearer and more comprehensive understanding of the latter stages of the build process, 
it’s essential to delve into another fundamental aspect of our discussion: the specific responsibilities 
and contributions of the locator within the build process. 


Address relocation and the locator 


The output file produced by the linker is not immediately suitable for use on a target microcontroller. 
This is because the addresses assigned to different sections during the linking process do not necessarily 
correspond to the actual memory layout of the target device. Therefore, these addresses must be 
relocated to match the target’s memory space accurately. This is the job of the locator. 


Source files Preprocessed files Assembly files 


5 Relocatable 
e) D Cimer | rable! (rar) | ecuatie 


Object files 


Figure 4.3: The build process, highlighting the relationship between the 
relocatable file, the locator, and the final executable output 


The linker script 


In the GNU toolchain, the locator functionality is integrated into the linker, streamlining the process 
of address relocation. This capability ensures that the final executable is correctly mapped to the 
microcontrollers memory, making it ready for execution. 


In this section, we examined the build process. From this, we observed that the process of linking 
object files in embedded systems development involves meticulous organization of code and data 
sections, symbol resolution, and address relocation. 


In the next section, we shall explore the key components of the linker script in detail. This exploration 
will offer additional insights and deepen our understanding of the core elements discussed in this section. 


Key components of the linker script 


The key components of a linker script include the memory layout, section definitions, options, and 
symbols, each playing a unique role in ensuring that the firmware is correctly placed and executed 
within the microcontroller’s memory. 


Memory layout 


This part of the linker script specifies the various memory types available in the microcontroller, such 
as flash memory and SRAM. It includes their start addresses and sizes, for instance, flash starting at 
0x08000000 or SRAM at 0x20000000. 


Section definitions 


A critical aspect of the linker script is defining how and where different sections of the program are 
placed. The . text section, containing the program code, is usually positioned at the beginning of 
flash memory. Following this, the Block Started by Symbol (.bss) and . data sections are allocated 
in SRAM. The linker script also ensures proper alignment of these sections for efficient memory access 
and program execution: 


e „text: 


" Purpose: The . text section holds the executable instructions of our program. It’s where 
the actual code that the processor executes resides. 


* Characteristics: This section is read-only and typically resides in the microcontroller’s flash 
memory, ensuring that the program code is preserved even when the device is powered off. 


* Size and location: The size of the . text section varies based on the amount of code in 
your program. In STM32 microcontrollers, it generally starts at a predefined memory 
address, often in the lower region of the flash memory. For example, 0x00000000 and 
then relocated to 0x08000000. 
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e „bss: 


" Purpose: The .bss section is used for uninitialized global and static variables. Variables 
in this section don't have initial values when the program starts. 


* Characteristics: This section is in the SRAM and is also read-write. It’s typically 0-initialized 
at startup, meaning all variables start with a value of 0. 


e „data: 


* Purpose: The . data section contains initialized global and static variables. Unlike variables 
in . bss, these variables have initial values specified in our code. 


* Characteristics: It’s a read-write section. At runtime, the values in the . data section are 
typically copied from flash memory to SRAM to allow faster access and modification. 


* Management: The process of copying these values from flash memory to SRAM is handled 
by the startup code, executed before the main function of our program. 


e Other sections: 


* „rodata: This section is used for constant data, such as string literals and constant arrays. 
It’s read-only and usually stored in flash memory. 


* heap and .stack are sections used for dynamic memory allocation (malloc, free) 
and function call stacks, respectively. They are part of SRAM and are crucial for runtime 
memory management. 


The following table summarizes the key sections and their placement in memory. 


Section Purpose Placed in 

-text Holds executable program instructions. FLASH 

.bss Holds uninitialized global/static variables. SRAM 

.data Holds initialized global/static variables with initial values. | FLASH (SRAM at runtime) 
.rodata | Holds constant data (string literals, constant arrays). FLASH 


Table 4.1: Linker script sections and their placement in memory 


Understanding the characteristics of these sections is important for having a properly 
functioning executable. 


The linker script 


Options and symbols 


Options in linker scripts are commands or directives that influence the behavior of the linker. A 
typical linker script includes directives for setting the entry point of the program and directives for 
defining the memory layout. 


Symbols in linker scripts are identifiers that act as placeholders or references to specific memory 
locations, values, or addresses within the microcontroller’s memory space. Symbols can be used to 
represent the start or end addresses of memory sections or specific variables in the program. For example, 
a symbol might be defined to represent the beginning of the flash memory or the start of the SRAM 
region. We can also use symbols to define important constants or values that are used throughout the 
firmware (such as source code files). These might include hardware addresses, configuration values, 
or size limits. By using symbols, the code becomes more readable and maintainable, as these values 
can be changed in one place (the linker script), rather than in multiple locations throughout the code. 


Now that we are familiar with the key components of linker scripts, we will proceed to learn about 
some of the essential directives within these scripts. Each directive in a linker script instructs the linker 
on how to process and organize the input object files into the final executable. 


Linker script directives 


In this section, we learn about the essential directives of linker scripts. These directives dictate the 
memory layout and how various sections—code, data, and others—are allocated within the target 
microcontroller’s memory. We will explore the key directives, their functionality, and how they influence 
the overall structure and efficiency of the compiled firmware. Let’s start with the MEMORY directive. 


Memory directive (MEMORY) 


The MEMORY directive delineates the microcontroller’s memory regions. Each defined block within 
the MEMORY section represents a distinct area of memory, characterized by its name, start address, 
and size. This directive allows us to define the memory layout of the target device, specifying different 
memory regions and their attributes. It plays an important role in guiding the linker on how to allocate 
sections of the program (code, data, etc.) across the microcontroller’s physical memory. 


Usage template 
The general syntax for the MEMORY directive is as follows: 


MEMORY 


{ 


name (attributes) : ORIGIN = origin, LENGTH = length 
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name: An identifier we give to the memory region 


attributes: This specifies the access permissions for the region, such as read, write, and 
execute permissions 


ORIGIN: This defines the start address of the memory region 


LENGTH: This specifies the size of the memory region 


Usage example 


Consider a microcontroller with flash memory for storing executable code and SRAM for data storage. 
A linker script might define these memory regions as follows: 


MEMORY 
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 256K 
SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 64K 


In this example, two memory regions are defined: FLASH and SRAM: 


FLASH is marked with read (r) and execute (x) permissions (rx), indicating that this region can 
store executable code but is not writable during program execution. It starts at the 0x08000000 
address and extends for 256K bytes. 


SRAM is given read (r), write (w), and execute (x) permissions (rwx), allowing it to store 
data and executable code that can be modified during runtime. It begins at the 0x20000000 
address and extends for 64K bytes. 


The MEMORY directive, with its comprehensive definition of memory regions and attributes, lays the 
foundation for efficient and effective memory management in firmware development. Before moving 
on to the next directive, lets examine all the attributes that can be specified to detail the characteristics 
and permissions of the memory sections: 


r: This attribute allows memory to be read. It is important for sections of memory containing 
executable code or constants that the program needs to read during execution. 


w: This attribute permits data to be written to the memory. It is important for memory areas 
where the program stores data dynamically during execution. 


x: This attribute allows the execution of code from the specified memory region. It is typically 
assigned to flash memory where the program code resides. 


rw: This is a combination of read and write permissions, allowing both operations in the 
specified memory region. It’s commonly used for sections such as SRAM where temporary 
data and variables are stored and modified. 
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e xx: This combines read and execute permissions. It’s often used for flash memory to indicate 
that the region contains executable code that the processor can read and execute. 


e rwx: This attribute combines all three permissions, making the memory region fully accessible 
for reading, writing, and executing. This is less commonly used due to security and system 
stability considerations but might be applicable in certain development or debug scenarios. 


e empty: Ifno attribute is specified, the memory region does not grant any access permissions 
by default. This might be used in special cases where permissions are controlled or modified 
by other means within the firmware. 


Now, let’s examine the ENTRY directive. 
The entry directive (ENTRY) 


This directive specifies the entry point of the program, which is the first piece of code to execute 
upon reset. 


Here is the usage template: 


ENTRY (SymbolName) 


Here is a usage example: 


ENTRY (Reset_Handler) 


In this example, Reset _Hand1er is designated as the entry point of the program, meaning, the 
first function to execute. In firmware development, Reset_Handler takes care of initializing the 
system and jumping to the main program. 


Next, we have the SECTIONS directive. 
The sections directive (SECTIONS) 


This directive defines the mapping and ordering of sections from input files into the output file. 


Usage example 
Let’s see a template of it: 


SECTIONS 


{ 


.output_section_name address 
input_section_information 

} >memory region [AT>load_address] [ALIGN(expression)] [:phdr_ 
expression] [=fill_ expression] 
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The parameters are as follows: 


output _section_name: This is the name given to the output section being defined. 
Common names include . text for executable code, . data for initialized data, and .bss 
for uninitialized data. 


address: This is optional and specifies the start address of the section in memory. This is 
often left to the linker to determine, based on the order of sections and memory regions defined 
in the script. 


input_section_information: This determines which input sections (from the compiled 
object files) should be included in this output section. Wildcards such as * (. text) can be 
used to include all . text sections from all input files. 


>memory_region: This assigns the section to a specific memory region defined in the 
MEMORY block of the linker script. We use this to tell the linker where in the target’s memory 
map this section should reside, for example, FLASH or SRAM. 


[AT>load_address]: This is optional and specifies the load address of the section. This is 
used in scenarios where the execution address differs from the load address. 


[ALIGN (expression) ]: This is optional and aligns the start of the section to an address 
that is a multiple of the value specified by expression. This is particularly useful for ensuring 
that sections begin at addresses that meet specific alignment requirements, which can enhance 
access speed and compatibility. 


[:phdr_expression]: This is optional and associates the section with a program header. 
Program headers are part of the Executable and Linkable Format (ELF) file structure; they 
provide the system loader with information about how to load and run different segments of 
a program. 


[=f£i11_ expression]: This is optional and specifies a byte value to fill gaps between 
sections or at the end of sections to reach a certain alignment. This can be useful for initializing 
memory regions to a known state. 


Usage example 


Let’s see an example of the SECTIONS directive in action: 


SECTIONS 


{ 


} 


.text 0x08000000 


{ 


*(. text) 
} >FLASH 


The linker script 


In this example, we have the following: 


e SECTIONS: This keyword begins the section of the linker script where output sections are 
defined. Output sections are areas of memory that hold the code and data from the input files 
being linked. 


e „text 0x08000000: This line defines an output section named . text and sets its starting 
address to 0x08000000. The . text section typically contains executable code. 


e { *(.text) }: This line specifies what goes into the . text output section. The * (. text) 
syntax means all . text sections from all input files. 


e >FLASH: This directive tells the linker to place the . text section in a memory region named 
FLASH. The FLASH region will be defined in the MEMORY directive block. 


To understand the importance of the * (. text) syntax, let’s examine the process of merging sections. 
Sections merging 


As we learned earlier, the assembler generates an object file for each source file, with each containing 
its . text, .data, . bss, and other sections. These sections from all object files are then merged by 
the linker into unified . text, . data, and .bss sections for the final executable. 


Consider a firmware project with two source files, main.c and delay . c. The assembly process 
yields main.o and delay. o, each with its own sections. The linker’s task is to consolidate these 
into a single set of sections for the final executable. 


The following figure depicts this process. Note that merging is not performed through an addition 
process; this is merely a visual aid to enhance your understanding. 


.text T .text 
.data + .data | 
mre | main.o |- - „bss + bss L -| _delay.o |] ra 
.rodata + .rodata | 
comment + comment | 
II 
text 
.data 
.rodata 
.comment 


Figure 4.4: The merging process involving two source files: main.c and delay.c, 
resulting in the production of the final executable, final.elf 
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Now, let’s explore the purpose of the AT > directive. To do this, it is essential to revisit the concepts 
of the LMA and VMA. 


A closer look at the LMA and VMA 


As we learned earlier, each allocatable and loadable output section in a binary output file is associated with 
two types of addresses: the LMA and the VMA. These addresses are crucial for defining how and where 
a section of the binary is processed during the system’s startup and its subsequent runtime operations: 


e LMA: This is the physical address in the binary image where the section is stored before program 
execution begins. It determines from where the system will load the section into memory when 
the program starts. 


e VMA: In contrast, the VMA is the address where the section is intended to be accessed during 
the program’s execution. This is the “runtime” address used by the system to refer to data or 
instructions in that section. For systems, particularly microcontrollers, that do not employ a 
memory management unit (MMU), the VMA usually matches the section’s physical memory 
address directly. 


Why are LMA and VMA important? 


The distinction between LMA and VMA allows for a flexible memory management approach where 
data can be stored in one location (such as flash memory) but run from another (such as SRAM). For 
example, initialized global and static variables (typically placed in the . data section) can be stored in 
flash memory but need to be copied to SRAM for faster access and to allow modification at runtime. 


To fully understand this, let’s consider the following snippet from a linker script generated by 
the STM32CubeIDE: 


-data 
{ 
= ALIGN(4) ; 
_sdata = .; /* create a global symbol at data start */ 
amaata) /* .data sections */ 
*(.data*) /* .data* sections */ 
* (,RamFunc) /* .RamFunc sections */ 
* (,RamFunc* ) /* .RamFunc* sections */ 
= ALIGN (4) ; 
_edata = .; /* define a global symbol at data end */ 


} >SRAM AT> FLASH 


The linker script 


In this script, the last line, >SRAM AT> FLASH, incorporates two important directives: 


e >SRAM indicates that the output . data section is placed in the SRAM section of the memory 
during program execution (VMA). 


e AT> FLASH specifies that although the section resides in SRAM when executed, it should 
initially be loaded into memory (LMA) at the corresponding address in FLASH. This is common 
for initialized data, which is stored in flash memory and then copied to SRAM at startup by 
the microcontroller’s initialization code. 


This detailed management of memory addresses highlights the critical role of LMA and VMA in 
maximizing the efficiency of resource-constrained microcontrollers. Through the effective use of LMA 
and VMA, we can ensure that even with limited memory resources, our microcontrollers operate 
reliably and efficiently, optimizing both storage and execution efficiency. 


Before moving to explore the other features of the linker script, let’s familiarize ourselves with some 
other commonly used directives. 


Other commonly used directives 


Some other commonly used directives include the KEEP, ALIGN, PROVIDE, >region, and AT 
directives. Lets examine them. 


The KEEP directive 


The KEEP directive ensures that specified sections or symbols are not eliminated by the linker during 
the optimization process, even if they appear unused. This is crucial for interrupt vector tables and 
initialization functions that must be present in the final binary. 


Here is the usage template: 


KEEP (section) 
Here is a usage example: 
KEEP (*(.isr vector) ) 
In this example, we are keeping the interrupt vector section. Next, let’s see the region placement directive. 


The >region directive 


The (>region) region placement directive tells the linker to place a particular section into a specific 
memory region. The available memory regions must be defined in the MEMORY directive block of 
the linker script. 
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Here is the usage template: 


section >region 


Here is a usage example: 


.data 


{ 


*(.data) 
} >SRAM 


In this example, we are placing the . data section in the SRAM memory region. 


The ALIGN directive 


The ALIGN directive plays a crucial role in the linker script by adjusting the location counter to align 
with specified memory boundaries. The location counter tracks the current memory address allocated 
by the linker for placing sections or parts of the output file during linking. 


The location counter is a built-in variable that represents the current address in memory where the 
linker is placing sections or parts of the output file during the linking process. It is denoted by a dot 
(.) in linker scripts. 


As the linker processes the script, it assigns memory addresses to code and data sections according to 
the script’s directives, with the location counter monitoring the progress. To ensure efficient memory 
access and adherence to hardware architecture requirements, sections and variables often need to be 
aligned to specific boundaries. The ALIGN directive allows us to achieve this by rounding up the location 
counter to the nearest address that matches the specified alignment, which must be a power of two. 


Here is the usage template: 


= ALIGN (expression); 


Here is a usage example: 


= ALIGN(4) ; 


In this example, we are aligning the current location to a 4-byte boundary. 


Next, let’s see the PROVIDE directive. 


The PROVIDE directive 


The PROVIDE directive allows us to define symbols that the linker will include in the output file if 
they are not already defined. This can be used to set default values for symbols that may be optionally 
overridden by other modules. 


The linker script 


Here is the usage template: 


PROVIDE(symbol = expression) ; 


Here is a usage example: 


PROVIDE(_ stack end = ORIGIN(RAM) + LENGTH (RAM) ) ; 


In this example, we are providing a default stack end address. 


Next, we have the AT directive. 


AT Directive 


The AT directive specifies LMA for a section when it needs to be different from the sections VMA. 
This is commonly used for sections that need to be loaded into a different memory area during 
initialization before being moved to their runtime location. 


Here is the usage template: 


section AT> lma_region 


Here is a usage example: 


.data : AT> FLASH 


{ 


*(.data) 
} >SRAM 


In this example, the . data section is intended to reside in SRAM during the program’s execution. 
However, it is initially loaded from FLASH, as indicated by AT> FLASH. 


In the next section, we will explore another key aspect of linker scripts: the expression of 
numerical constants. 


Understanding constants in linker scripts 


When writing our linker script, we must keep in mind the interpretation of numerical prefixes and 
suffixes by the linker. 


Firstly, let’s clarify how the linker perceives integers with specific prefixes. An integer prefixed with 0 is 
read as an octal number by the linker. On the other hand, an integer starting with 0x is recognized as 
a hexadecimal value. This distinction is important for accurately defining memory addresses and sizes. 
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The use of the K and M suffixes introduces another layer of convenience, allowing us to denote large 
numbers succinctly. The K suffix multiplies the preceding number by 1024, while M expands the 
number by 1,024 twice over. Therefore, 4K translates to 4 times 1024, and 4M expands to 4 times 
1,024 squared. 


To put these principles into practice, let’s explore an example that showcases the versatility of these 
notations. Imagine you need to specify a memory size of 4K. You could straightforwardly use 4K, or 
opt for its decimal equivalent, 4096, which results from multiplying 1,024 by 4. Alternatively, this 
quantity can be expressed in hexadecimal form as 0x1000. 


Table 4.2 summarizes the key points to remember about using constants in linker scripts. It highlights 
the prefixes and suffixes that modify the base value, which offers a clear reference for interpreting and 
using these notations effectively in your linker script. 


Notation | Meaning Example | Equivalent | Hexadecimal 
Decimal Notation 

0 Octal prefix 010 8 2 

Ox Hexadecimal prefix 0x10 16 F 

K Multiplies by 1,024 4K 4096 0x1000 

M Multiplies by 1,024 twice (squared) | 4M 4194304 0x400000 


Table 4.2: Examples of linker script numerical prefixes and suffixes 


In the next section, we shall learn about linker script symbols, further enhancing our understanding 
of linker scripts. 


Linker script symbols 


Linker symbols, also known simply as symbols, are fundamental elements in the process of converting 
source code into executable programs. At its core, a linker symbol comprises two essential components: 
a name and a value. These symbols are assigned integer values, representing memory addresses where 
variables, functions, or other program elements are stored in the microcontrollers memory. 


Previously, we learned that after the assembly stage, the source code is transformed into object files. 
These object files contain machine code and unresolved references to variables and functions. The 
linker’s primary task is to merge these object files, resolve these unresolved symbols, and generate a 
complete executable file ready for execution. 


In the context of linker symbols, the value assigned to a symbol represents the memory address where 
the corresponding variable or function resides. 
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For example: X = 3500 means the memory address of X is 3500 


A symbol named X might be assigned a value of 3500, indicating its memory address. It’s 
crucial to note that in contrast to the variable’s value in the source code, the X linker symbol 
represents its memory address. 


Name | Type Value (Memory | Description 
Address) 

X Symbol | 3500 Represents the memory address where an X variable 
is stored. 

T Symbol | 0x3000 Represents the memory address where a Y variable 
is stored. 

foo() | Symbol | 0x4000 Represents the memory address where a foo () function 
is located. 

bar() | Symbol | 0x5000 Represents the memory address where a bar () function 
is located. 

x Variable | 3500 Represents the value of a C variable named x. 

Y Variable | 4500 Represents the value of a C variable named y. 


Table 4.3: Comparison of linker symbols and C source code variables assignments 


Linker symbols can undergo various operations, such as those we use in C assignments. These operations 
include straightforward assignment (=), addition (+=), and subtraction (- =), among others. 


r 


+= < ) ; 
/= ; 
<<= r 
>>= r 
&= r 


Figure 4.5: Examples of linker symbol operations 


During the linking process, a symbol table is created, mapping each symbol to its corresponding 
address in memory. This table serves as a crucial reference for the linker to resolve symbol references 
and ensure proper linking of program components. 
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Let’s consider a scenario where we have a main . file and at the top of this file; we declare a variable 
named X, assigning it a value of 568. Additionally, within this file, there’s a function named blink. 
Inside the blink function, there are operations to turn on an LED, wait, and then turn it off. This 
is depicted in Figure 4.6. 


Now, let’s take this main. c file and pass it through the build process to generate the main. o object 
file. During this process, a symbol table is generated. Each symbol in this table is associated with 
an address. 


For instance, the X symbol would be assigned an address of 0x20000000, and similarly, the blink 
function would also be assigned its address. In the following figure, the blink function is assigned 
the 0x08000000 address. 


Essentially, just like in the C programming language, each variable has its value. In the object file, 
each symbol has its value, which essentially represents the address of the corresponding variable or 
function in C. 


So, when referring to X in the object file, it wouldn't give us 568; rather, it would provide the address 
of X. This process of assigning values to symbols and associating them with addresses constructs the 
symbol table. 


main.c main.o 


int x = 568; 


void blink (void) Symbol table 


Build Process > 
{ Symbol address 
x 0x20000000 
blink 0x08000000 
} Š r 
e e 
e a 


Figure 4.6: Representation of functions and variables from the source 
file in the symbol table of the output object file 


In this section, we delved deep into linker scripts, highlighting key components and directives. We 
carefully explored each directive, providing practical usage examples. Additionally, we distinguished 
between the LMA and VMA and also emphasized their important roles in guiding the linker on how 
to place sections. In the next section, we will learn how to write our own linker script and startup 
file from scratch, equipping you with another important skill in bare-metal firmware development. 


Writing the linker script and startup file 


Writing the linker script and startup file 


Now that we have a good understanding of linker scripts and their essential components, were 
prepared to write our own. However, before diving into writing the script, it’s crucial to revisit the 
memory map of the microcontroller and gain insight into the positions load memory for various 
sections within the object file. 


Understanding the load memory of different sections 


As discussed earlier in this chapter, the output object file is structured into sections such as . data, 
.rodata, .text, and .bss. Together with the sections created by the assembler, we must define 
our own section to accommodate the vector table for interrupt service routines (ISRs). We will name 
this section .isr_ vector tbl. 


Each of these sections plays an important role in organizing the memory layout of the microcontroller, 
contributing to the functionality and efficiency of the final executable. 


Figure 4.7 shows a zoomed-in view of the flash memory area, showing the required order for placing 
the different sections within the flash memory. Each section represents the combination of identical 
sections from all input files. For instance, the . text section depicted is a unified . text section, 
formed by merging all . text sections from the input files. 


The diagram indicates that the placement must start with the .isr_vector_tbl1 section at the 
beginning of the flash memory. Following this, we must place the . text section, then the . rodata 
section, and finally, the . data section. The diagram does not show the placement of the .bss 
section, as we will place the . bss section directly in the SRAM. Additionally, during the startup code 
implementation, we must copy the content of the . data section from the flash memory to the SRAM. 


SRAM 
¥ | .data — „data output section 
Flash Memory a 
.rodata 
text — „text output section 
-isr_vector tbl 


Figure 4.7: The flash memory area showing the order in which sections should be placed 
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Before we proceed, however, let’s understand the concept of interrupts and the vector table. 


Interrupts and the vector table 


Interrupts are a fundamental concept in computing. They act as a powerful mechanism for managing 
how a computer or a microcontroller handles tasks and responds to external and internal events. 


At its core, an interrupt is a signal to the processor from a hardware device or an internal software 
condition that temporarily halts the current operations. This signal indicates that immediate attention 
is required. When the processor receives an interrupt, it pauses its current tasks, saves its state, and 
executes a function known as an ISR to address the interrupt. Upon completing the ISR, the processor 
resumes its previous tasks, ensuring that critical signals receive prompt and efficient handling. 


What are the types of interrupts? 


We can broadly classify interrupts into two categories: hardware interrupts and software interrupts: 


e Hardware interrupts: These originate from external devices, such as switches, network adapters, 
or any peripheral that needs to communicate with the processor. For example, pressing a push 
button may trigger a hardware interrupt that informs the processor to start a motor. 


e Software interrupts: Unlike hardware interrupts, software interrupts are triggered by software 
instructions. These are used by programs to interrupt the current process flow and execute a 
specific routine. 


What is the role of the interrupt vector table? 


The interrupt vector table serves as an essential lookup table, guiding the processor to the correct ISR 
for each interrupt. An ISR is simply a function designed to address and manage the specific needs 
triggered by an interrupt. The table itself is organized as an array of pointers, with each pointer 
directing the system to the designated ISR for a given interrupt. Upon the occurrence of an interrupt, 
the system references this table to locate the exact memory address of the ISR required for handling 
the interrupt. This efficient mechanism enables the system to promptly respond to various events, 
such as external inputs, timer expirations, and changes in the internal state. 


With this in mind, we are finally ready to write our linker script. 


Writing the linker script 


In our workspace folder, lets make a new folder named 3. LinkerscriptAndStartup. In this 
folder, create a file called stm32_1s.1dand make sure its extension is . 1d. If you're using Windows 
and it asks if you really want to change the file extension, click Yes. Then, right-click the file and open 
it with a basic text editor such as Notepad++. 


Writing the linker script and startup file 


Our objectives with the linker script can be summarized as follows: 


e Specifying the firmware’s entry point 
e Detailing the available memory 
e Specifying the necessary heap and stack sizes 


e Defining output sections 


This is our complete linker script, the contents of the stm32_1s.1d file: 


/*Specifying the firmware's entry point*/ 
ENTRY (Reset Handler) 


/*Detailing the available memory*/ 

MEMORY 
FLASH (rx) :ORIGIN =0x08000000,LENGTH =512K 
SRAM (rwx) :ORIGIN =0x20000000,LENGTH =128K 


_estack = ORIGIN (SRAM) +LENGTH (SRAM) ; 
/*Specifying the necessary heap and stack sizes*/ 


__max_ heap size = 0x200; 
__max_stack_size = 0x400; 


/*Defining output sections*/ 
SECTIONS 


{ 


-eexe 
{ 
= ALIGN (4) ; 
e (( -aliswe Wyeeicione “icloyll,) 
m2 ( CEE) 
*(.rodata) 
= ALIGN (4); 
Geese = of 
} >FLASH 
.data 
{ 
= ALIGN (4); 
usdata = s} 
*(.data) 
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= ALIGN(4) ; 
mecdaicake—ars, 
} > SRAM AT> FLASH /*>(vma) AT> (lma)*/ 
-bss 


Let’s break it down. 
Specifying the firmware’s entry point 
ENTRY (Reset _Handler) 


As we learned earlier, the ENTRY directive specifies the entry point of the firmware, which is the first 
piece of code that gets executed when the firmware starts. In this case, the entry point is the function 
named Reset_Handler. We shall implement this function in the startup file. 


Detailing the available memory 


MEMORY 


{ 


FLASH (rx) :ORIGIN =0x08000000,LENGTH =512K 
SRAM (rwx) :ORIGIN =0x20000000,LENGTH =128K 


} 


Our script specifies two memory regions: FLASH and SRAM. The FLASH memory, with read and 
execute permissions (rx), starts at the 0x08000000 address and has a length of 512K. The SRAM 
memory, with read, write, and execute permissions (rwx), starts at the 0x20000000 address and 
has a length of 128K. 


Symbol creation 


_estack = ORIGIN (SRAM) +LENGTH (SRAM) ; 


Over here, we create a symbol called estack and we set it to the end of the SRAM memory region. 
We will use this symbol to initialize the stack pointer. 
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The stack pointer is an important register within our microcontroller that keeps track of the top 
of the stack. When a function is called, the address of the next instruction (return address) and the 
function's local variables are pushed onto the stack. The stack pointer is then adjusted to reflect the 
addition of these new items. Conversely, when a function returns, the stack pointer helps to pop the 
return address and the local variables off the stack, reverting it to its state before the function is called. 
The stack grows downward (from high memory to low memory addresses in most architectures), so 
setting the stack pointer to the end of SRAM ensures that it starts at the maximum available address, 
utilizing the SRAM space efficiently for stack operations. 


The next lines of code in our linker script specify the heap and stack sizes. 
Specifying the necessary heap and stack sizes 


__max_ heap _size = 0x200; 
__max_stack_size = 0x400; 


These lines define the maximum sizes for the heap (0x200 bytes) and stack (0x4 00 bytes). These 
sizes are important for dynamic memory allocation and function call management, respectively. 


The next segment defines the output sections. 
Defining output sections 
In this section, we will go through the output sections. 


The .text output section 


This segment of our linker script shows the . text output section: 


a CELSE 
= ALIGN (4); 
*(.isr vector tbl) /*merge all .isr_ vector _tbl sections of input 
files*/ 
*(.text) /*merge all .text sections of input files*/ 


*(.rodata) /*merge all .rodata sections of input files*/ 
= ALIGN(4); 
_etext = .; /*Create a global symbol to hold end of text section*/ 
} >FLASH 


Let’s break it down: 


e . = ALIGN(4);: 


This directive aligns the start of the . text section on a 4-byte boundary. This enhances 
memory access efficiency, which is a critical consideration for processors fetching instructions 
in word-sized chunks. 
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*(.isr_ vector tbl): 


This directive pulls in all sections named .isr_vector_tbl from the input files into the 
current location in the . text section. 


*(.text): 


This directive pulls in all sections named . text from the input files into the current location 
in the . text section. 


*(.rodata): 


This directive pulls in all sections named . rodata from the input files into the current location 
in the . text section. 


= ALIGN (4) ;: 


Again, this line ensures that the end of the section is aligned to a 4-byte boundary. Over here, we 
use it to align the end of a section, ensuring that the next section starts on an aligned boundary. 


etext = .;: 


Over here, we define a symbol called etext at the current location. This symbol marks 
the end of the . text section. We will use this symbol as a pointer to the end of the . text 
section in our startup file. 


} >FLASH: 


This directive specifies that the . text section should be placed in the FLASH memory segment 
as defined earlier in the MEMORY block of the linker script. 


This segment shows the . data output section: 


.data 
{ 
= ALIGN(4); 
sdara = 2; /*Create a global symbol to hold start of data 
section*/ 
*(.data) 
= ALIGN(4); 
_edata = .; /*Create a global symbol to hold end of data 
section*/ 
} > SRAM AT> FLASH /*>(VMA) AT> (LMA) */ 
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Let’s break it down: 


° = ALIGN (4) ;: 
This directive aligns the start of the . data section on a 4-byte boundary. 
e _sdata = .;: 


Over here, we create a symbol named _ sdata to represent the start of the . data section by 
setting it to the current location counter. We will use this symbol as a pointer to the start of 
the . data section in our startup file. 


e *(.data): 


This directive pulls in all sections named . data from the input files into the current location 
in the . data section. 


e . = ALIGN(4) ;: 
This line ensures that the end of the section is aligned to a 4-byte boundary. 
e _edata= .;: 


Similar to what we have done previously, we create a symbol named _edata to represent the 
end of the . data section by setting it to the current location counter. We will use this symbol 
in our startup file. 


e > SRAM AT> FLASH: 


This directive specifies the LMA and the VMA of the . data section. 


> SRAM indicates that the section should be located in SRAM, allowing read and write access 
at runtime. 


AT> FLASH tells the linker that although the section is placed in SRAM for execution, its 
initial values should be stored in FLASH. 


This segment shows the . bss output section: 
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This is the final output section of our linker script. 


As we learned earlier, the . bss section holds uninitialized global and static variables that we will be 
initialized to zero in our startup file. This zero-initialization ensures that all variables in this section 
begin with a known state, contributing to our firmware’s stability and predictability. 


Similar to other sections, we begin by aligning the section to a 4-byte boundary for efficient memory 
access, and then we define the _sbssand_ebss symbols to mark the start and end of the section, 
respectively. These symbols facilitate the calculation of the section’s size and its initialization process. 
Finally, we place the section in the SRAM, emphasizing that, although it doesn’t occupy space in the 
binary file on disk, it requires runtime allocation in memory. 


With our linker script finalized, we'll move on to implementing the startup file. This shall be the focus 
of the next section. 


Writing the startup file 


The startup file is essential for initializing the firmware and it performs several critical tasks to ensure 
the system operates correctly from the moment it is powered on. 


These tasks include the following: 


e Implementing the vector table: This involves defining the vector table that maps interrupts to 
their handlers, ensuring the system can respond to various events efficiently. 


e Creating interrupt handlers: For each interrupt listed in the vector table, an interrupt handler 
must be implemented to define how the system responds to that particular event. 


e Establishing the firmware’s entry point: This refers to implementing Reset_ Handler, as 
specified in the linker script, which acts as the initial entry point of the firmware. This function 
is executed immediately after reset and is responsible for setting up the environment for the 
main application. 


e Transferring the .data section: This involves copying the . data section from FLASH to SRAM. 


e Zeroing the .bss section: This involves initializing the . bss section to zero, ensuring that all 
uninitialized global and static variables start with a known state for reliable operation. 


In the current folder containing the linker script, create a file called stm32£411_startup.cand 
make sure its extension is . c. If youre using Windows and it asks if you really want to change the file 
extension, click Yes. Then, right-click the file and open it with a basic text editor such as Notepad++. 


Let’s analyze the complete startup code. 


Writing the linker script and startup file 


The following is our complete startup code written in C, the contents of the stm32f£411_startup.c 


file. In the 
the vector 


extern 
extern 
extern 
extern 
extern 
extern 


following snippet, we are not showing all the function prototypes of all the interrupts in 
table. The complete source code can be found in the resources accompanying the book: 


lines 2aieemesizack:; 
Uints2 E letext; 
Uine o2 ic sdara; 
Uintes2mtmedaibar, 
tinca e FDBS? 
timega E ADEA 


voidikeset Handler(void)i 


int main (void); 
void NMI_Handler(void) attribute ((weak, 
alias ("Default Handler") ))j; 
void HardFault Handler (void) _ attribute ((weak, alias("Default_ 
Handler") )); 
void MemManage_ Handler (void) _ attribute ((weak, alias ("Default_ 
Handler"))); 
thung e Veeco Holl pedttrrbure (section(s dle veer cou — { 
(uint32_t)& _estack, 
(uint32_t) &Reset_Handler, 
(uint32_t)&NMI_ Handler, 
(uint32_t)&HardFault Handler, 
( ) 


u 


J; 


int32_t)&MemManage_Handler, 


void Default_Handler (void) { 


wh 


} 
} 


Hey í 


void Reset_Handler (void) 


{ 
Hi 


Calculate the sizes of the .data and .bss sections 


Chiesa © Cace men Salve = (viaria Ce ekite = ((bolinjes\2_ e Scere, 


tumeda E bS Men Slza = (thlinias2 ©) e Goss = (wuars? ce JIa, 


// 
// 


Initialize pointers to the source and destination of the .data 
section 
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Chines ic so Ere wen = (whines ie =) & Steep 
lines 2a E s9 Cest wem = (Uimts Jt) Susdatzar, 


/*Copy .data section from FLASH to SRAM*/ 
Toria ie a = Wp al e cata nmen Salyieg aie | 


{ 
} 


// Initialize the .bss section to zero in SRAM 
D cese lie = (wimi? © 2) i a6) 


*p_dest_mem++ = *p_src_mem++; 


Tormes ie a = Op al e ee wem eizaz Lad) 
/*Set bss section to zero*/ 
*p dest _mem++ = 0; 


// Call the application's main function. 
main(); 


} 
Let’s break it down. 


We start by declaring external symbols. 


External symbol declarations 


ekcema Whlinc3 te _ Ssicevele. 
extern Uints2 E CCS 
extern uint32_t _sdata; 
ekcema wiata ic _ edate, 
Grcexcia timc 32 © _ Sloss 
Grecia Wiacj2 © AS, 


These lines declare the external symbols that we defined in the linker script. Each symbol represents 
an important memory address used during the startup process: 


e _estack: This is the initial top of the stack. This value is loaded into the main stack pointer 
register early in the startup process. 


e _etext: This marks the end of the executable code section and the beginning of the data 
sections stored in flash memory. We use this as a reference point for copying initialized data 
from FLASH to SRAM. 
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e _sdataand_edata represent the start and end addresses of the initialized data section in 
SRAM, respectively. We use them to determine the size and destination for data copying from 
FLASH to RAM. 


e _sbssand_ebss mark the start and end of the uninitialized data section (BSS section) in 
SRAM. We use these symbols to clear this section, setting it to zero. 


Next in our snippet, we have the function prototypes and their attributes. 


Function prototypes and attributes 


void Reset _Handler (void); 


int main(void) ; 


void NMI _Handler(void) attribute ((weak, 
alias ("Default Handler") )); 


void HardFault Handler (void) attribute ((weak, alias ("Default 
Handler"))); 
void MemManage_ Handler (void) _ attribute ((weak, alias("Default_ 
Handler"))); 


At this part of the startup file, we declare the prototype for the Reset_Handler function, the 
application’s main function, and several interrupt handlers with specific attributes: 


__attribute  ((weak, alias("Default Handler") ) ) : This attribute makes each 
handler weakly linked and aliases it to a function named Default_Handler. It allows these handlers 
to be overridden by explicitly defined handlers with the same name elsewhere in the application. 


Let’s break down the statement further to understand its significance: 


e __attribute_: 


We use this keyword to tell the compiler that the declaration it’s applied to has certain properties 
that affect how it’s treated by the linker and, potentially, at runtime. Attributes can be used to control 
optimizations, code generation, alignment, and, relevant to our discussion, linkage characteristics. 


e weak: 


Declaring a function or variable as weak means that it does not prevent the linker from 
using another symbol of the same name with a stronger linkage. We use this to specify default 
implementations that can be overridden. 


In the context of our interrupt handlers, marking them as weak allows us to define default 
handlers in our startup file, which application-specific handlers can override without modifying 
the startup file. 
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e alias ("Default _Handler"): 


This part of the attribute creates an alias for another symbol, in this case, Default_Handler. 
It means that the symbol (e.g., NMI_ Handler) is not just weak, but it is also an alias for the 
Default Handler function. 


This means that when an interrupt occurs, and a specific handler (such as NMI_ Handler) has 
not been defined elsewhere in the application with stronger linkage (non-weak), the program 
will use Default_Handler in its place. This ensures that all interrupts have a handler, 
preventing the system from crashing due to unhandled events. 


Next, we have the vector table array. 


Vector table definition 


tukae e vector Colil (matenibuten) (Section! sVsrivector, Chill) = { 
(uint32_ t)& estack, 

uint32_t)&Reset_Handler, 

uint32_ t)&NMI Handler, 

uint32_ t) &HardFault Handler, 


( 
( 
( 
(uint32 t) &MemManage Handler, 


) 
) 
) 
) 


IPR 


This array defines the microcontroller’s interrupt vector table, placed in the .isr_vector_tbl 
section we defined in the linker script. 


We set the & estack symbol as the first element of the vector table to define the initial top of the stack 
in memory. In ARM Cortex microcontrollers, such as our STM32F411, the first word (32 bits) of the 
vector table must contain the initial value of the main stack pointer (MSP). Upon reset, the processor 
loads this value into the MSP register to set up the stack pointer correctly before executing any code. 


Following this, we specify the address of Reset_ Handler, then we proceed to list the addresses for 
NMI_Handler and other subsequent interrupt handlers in sequence. The precise placement of these 
handlers is crucial, as each must reside in a specific memory location to ensure correct functionality. 
This arrangement is detailed on page 201 of the RM0383 document. Within the fully defined vector 
table in our stm32£411_startup.c file, you'll notice that there are zeros strategically placed among 
the interrupt handler addresses. These zeros act as placeholders for the positions corresponding to 
interrupts not supported by our specific microcontroller variant (STM32F411). The ARM Cortex-M 
core architecture is designed to support a comprehensive set of interrupts, yet not all interrupts are 
implemented across every microcontroller variant. By inserting zeros for these unsupported interrupts 
in the vector table, we maintain the required alignment with the architecture's specifications, ensuring 
the system operates correctly. 
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Let’s take a closer look at the array declaration: 


uint32 t vector tbl[] _ attribute ((section(".isr vector tbl") ))={..} 


e uint32 t vector tbl[]: 


This specifies that each element of the vector_tbl array is an unsigned 32-bit integer. We 
chose this type because addresses in ARM Cortex-M microcontrollers are 32 bits in length, and 
the vector table consists of memory addresses pointing to the start of ISR handlers. 


e —_attribute_ ((section(".isr_vector_tbl"))): 


This attribute instructs the linker to place the vector_tbl array in a specific section of the 
output file named .isr_vector_tbl. 


Next, we have our default handler function. 


Default dandler 


void Default Handler(void) { 
while(1) { 
// Infinite loop 


} 


This function serves as a universal fallback for any interrupt request for which a specific handler has 
not been implemented. Engaging in an infinite loop effectively prevents the program from proceeding 
into an undefined state following such an event. 


It is linked to all interrupt handlers marked as weak and aliased to Default_Handler within 
the application. This strategy ensures a uniform and secure response throughout the system to any 
interrupt requests that lack a dedicated handler, thus upholding system stability and integrity. 


Finally, we have our Reset_Hand1ler function. 


Reset handler implementation 


void Reset Handler(void) { 


Chie A E Cece men Gilze = (varoa Ele eckite = ((bilinjes\4_ic)) ts Scere, 
PhidesA ic boe men Gilze = (uint32_t)& ebss - (uint32_t)& sbss; 
PhiNesA e “jo ere mem = (Witness A © Je ereny 

toimega e s9 esr maw = (wihinz32 © *)ieisdatal; 

for (uint32_t i = 0; i < data_mem_size; i++ ) { 


*p_dest_mem++ = *p_src_mem++; 
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} 


D Cesc nemn = (wales o Se AES; 

for(uint32 e i = 0; i < bss mem_size; i++) { 
*p dest _mem++ = 0; 

main(); 


} 


The job of Reset Handler is to prepare the system before executing the main application. 
In the function, we start by calculating the sizes of the .data and .bss sections: 


tiata c Cate man olza = (wolintesA ic)& carca = (lacs? ce garep 
tinta ic boss mem giza = (thine A Ele Close = (Walaa? c) G57 


Here is a breakdown: 


e Calculate the size of the . data section by subtracting the address of the start of the section 
(_ sdata) from the address of the end (_edata). This size is used to copy initialized data 
from FLASH to SRAM. 


e Calculate the size of the . bss section in a similar manner, using the start (_sbss) and end 
(_ ebss) addresses. This size is used to zero out the . bss section in SRAM. 


Next, in the function, we initialize pointers for copying the . data section: 


Canes © 59 Sme men |  (UelME!A te <Je Slices 
taea ie jo) dest uem = (uime? c <Je Cate, 


Here is the breakdown: 
e Initialize a source pointer (p_src_mem) to the address where initialized data is stored in flash 
memory, marked by etext. 


e Initialize a destination pointer (p_dest_mem) to the start of the . data section in SRAM (_ 
sdata). 
Then, we copy the . data section from FLASH to SRAM: 


for(uint32 t i = 0; i < data_mem_size; i++ ) { 
*p dest ment = *prusre mem++; 
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The breakdown: 


e Copy the .data section from FLASH to SRAM word (32-bit) by word. For each iteration, the 
content pointed to by p_src_memis copied to the location pointed to by p_dest_mem, and 
then both pointers are incremented to the next word. 


Next, we initialize the pointer for the . bss section zeroing: 


3 Cesc mem = (bubimcsa_ =) & SSS; 


We simply reset the destination pointer (p_dest__mem) to the start of the . bss section in SRAM 
(_sbss), preparing it for zeroing. 


We then zero out the . bss section: 


for(uint32_t i = 0; i < bss mem size; i++) { 
*podest mem = 0); 


} 


This block zeroes out the . bss section in SRAM word by word. For each iteration, the location pointed 
to byp_dest_memis set to 0, and then p_dest_memis incremented to the next word. 


Finally, we call the main () function located in the main. c file of our source code: 


main(); 


After initializing the .data and .bss sections, this line calls the main function, transferring 
control to the main application code. This marks the end of the system initialization process and the 
beginning of the application execution. 


Now that we have completed both our linker script and startup file, it is time to test our implementation 
by building the firmware using just our main.c source file, the stm32_1s.1d linker script, and 
the stm32£411_startup.c startup file. 


Testing our linker script and startup file 


Before diving into the command line, it’s important to ensure that our linker script and startup file 
are correctly placed. Let’s set up our project directory and add some modification to our main. c file: 


1. Create a new directory: Create a new folder named 3_ LinkerAndStartup in your workspace. 


2. Transfer required files: 


" Locate the main. c file from the previous project (2. RegisterManipulation), which 
includes the foundational application code. 
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" Additionally, locate stm32_1s.1d (linker script) and stm32£411_startup.c 
(startup file). 


* Copy and paste these files (stm32_1s.1d, stm32f411_startup.c,andmain.c) 
into the 3. LinkerAndStartup folder. 


Modify the main application code: To visually confirm the successful execution of our latest 
firmware, let’s adjust the LED blink rate within the main . c file from fast to slow: 


* Open the main.c file for editing: Right-click on the main. c file within the 3 _ 
LinkerAndStartup folder and select the option to open it with a simple text editor, 
such as Notepad++. 


* Change the blink rate: Search for the code segment that controls the toggling of PA5 
(LED_ PIN). Adjust the delay intervals within this section to change the LED’s blink rate 
from its current rapid pace to a slower one. The current one should look like this: 


// 22: Toggle PA5(LED PIN) 
GPIOA_OD R ^= LED PIN; 
for(int i = 0; i < 100000; i++) {} 
Replace the current code with the following snippet to toggle the state of PAS at a slower rate: 


// 22: Toggle PA5(LED_ PIN) 
GPIOA_OD R “= LED PIN; 
for(int i = 0; i < 5000000; i++) {} 


We are simply changing the loop iteration from 100,000 to 5,000,000. 


Save the changes: After updating the code, save the main. c file. 


Now, lets access our new folder through the Command Prompt following the steps we learned in 
Chapter 3. My favorite method for Windows users is the context menu method: 


Navigate to the 3. LinkerAndStartup folder in Windows Explorer. Once there, hold down the 
Shift key, right-click in a space within the folder, and select Open Command Window Here. This 
action will open a Command Prompt window directly in the 3_ LinkerAndStartup folder. 


In the Command Prompt, we start by compiling the main. c file, and we do this by executing 
the following: 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb -std=gnull main.c -o 
main.o 


Then, we execute our startup file: 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb -std=gnull stm32f411_ 
startup.c -o stm32f411 startup.o 


Writing the linker script and startup file 


Once our main.o and stm32£411_startup.o object files are ready, we go ahead and link all 
object files (* . o) using our linker script: 


arm-none-eabi-gcc -nostdlib -T stm32 ls.1ld *.o -o 3_LinkerAndStartup. 
elf.elf 


This process produces the 3. LinkerAndStartup.el1f executable. 
Next, we launch openocd to begin the uploading process: 


openocd -f board/st_nucleo f4.cfg 


With OpenOCD running, the next step involves using the GNU Debugger (GDB) to upload the 
firmware to the microcontroller. Let’s access another Command Prompt window (as OpenOCD should 
keep running in the first one) and enter the following command to start the GDB: 


arm-none-eabi-gdb 

Once GDB is open, we establish a connection to our microcontroller by running: 
target remote localhost:3333 

Let’s reset and initialize the board as we learned in Chapter 3 using the following command: 
monitor reset init 

Next, we load the firmware onto the microcontroller using the following command: 
monitor flash write image erase 3 LinkerAndStartup.elf 

After successfully loading the firmware, we reset the board again with the same reset command: 
monitor reset init 

Finally, we resume the execution of the firmware on the microcontroller with the following: 


monitor resume 


There you have it; you should see the LED blinking at a slower rate, indicating the successful upload 
and execution of our new firmware. 
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Summary 


In this chapter, we deeply explored the core components of embedded bare-metal programming, 
focusing on the microcontrollers memory model, writing linker scripts, and startup files. We began by 
exploring the STM32 microcontroller’s memory layout, emphasizing the importance of flash memory 
and SRAM for storing executable code and runtime data. 


We then dedicated a significant portion of the chapter to constructing and understanding linker scripts. 
Through this, we understood these scripts’ critical role in the firmware build process by mapping the 
compiled firmware sections to the microcontroller’s specific memory regions to ensure the executable 
operates correctly. We learned about the various directives within a linker script, such as MEMORY 
and SECTIONS. These directives are crucial for defining the memory layout and specifying where 
and how program sections are placed in memory. 


Our discussion on linker scripts extended to the practicalities of defining memory regions, aligning 
sections, and managing section attributes for optimal memory utilization. We gave special attention 
to the LMA and VMA, which are essential for efficient program loading and execution. 


Transitioning to the startup file, we meticulously outlined the startup file’s role, covering the initialization 
of the vector table, setting up Reset_Handler, and preparing the system for the execution of the 
main application. We learned the procedures for copying the . data section from FLASH to SRAM 
and zeroing the . bss section, ensuring a predictable start for our firmware. 


In the next chapter, we will explore build systems, highlighting the essential role of the Make tool. 
This knowledge will enable us to streamline our build process by automating it, instead of manually 
entering each command in the command line. 


5 
The “Make” Build System 


In this chapter, we will learn how to automate our entire build process using build systems, specifically 
focusing on the make build system - an indispensable tool for automating the compilation and linking 
processes in software development. We start by defining what a build system is and then exploring 
its fundamental purpose, which primarily involves automatically transforming source code into 
deployable software, such as executables or libraries. 


Throughout the chapter, we will systematically uncover the components of the make build system, 
starting with the essential elements of a Makefile, including targets, prerequisites, and recipes. In the 
latter part of the chapter, I will provide a step-by-step guide on writing a Makefile, highlighting the 
syntax and structure necessary to execute builds effectively. 


In this chapter, were going to cover the following main topics: 
e An introduction to build systems 
e The Make build system 


e Writing Makefiles for firmware projects 


By the end of this chapter, you will have a solid understanding of how to leverage the make build 
system to streamline your development process, improve build times, and reduce manual errors in 
building and deploying your firmware. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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An introduction to build systems 


In the world of software development, build systems are pivotal tools that enable the transformation 
of source code into executable programs or other usable software formats. These systems automate the 
process of compiling and linking code, managing dependencies, and ensuring that software builds are 
reproducible and efficient. Simply put, a build system refers to a set of tools that automate the processes 
of compiling source code into binary code, linking binaries with libraries, and packaging the results 
into deployable software units. These systems are designed to handle complex dependency chains by 
tracking which parts of a software project need recompilation, thereby optimizing the build process. 
Build systems are responsible for a range of tasks. These include the following: 


e Dependency management: This involves identifying and resolving interdependencies among 
various components or libraries that software requires. 


e Code compilation: Converting source code, whether it’s written in C, C++, or another 
programming language, into machine-readable object code. 


e Linking: This process integrates the compiled object files and necessary libraries into a unified 
executable or library file. 


e Packaging: This step prepares the software for deployment, which might include creating 
installer packages or compressing software into distributable archives. 


e ‘Testing and validation: Executing automated tests to confirm that software adheres to the 
predefined quality benchmarks before its release. 


e Documentation generation: Build systems can also automate the creation of documentation. 
This is achieved by integrating with tools such as Doxygen for C/C++, Javadoc for Java, or Sphinx 
for Python, which extract annotated comments and metadata from source code to produce 
structured documentation. This automation ensures that documentation stays synchronized 
with changes in the source code, thereby maintaining consistency and reducing manual errors. 


By incorporating these diverse functions, build systems significantly boost the efficiency and reliability 
of the software development process. Modern software projects often involve complex configurations, 
including thousands of source files and a wide array of external dependencies. Build systems provide a 
crucial framework to manage these complexities efficiently. They automate repetitive tasks, minimize 
the likelihood of human errors, and guarantee consistent builds across various environments. This 
streamlining not only enhances productivity but also supports the adoption of continuous integration 
and continuous delivery practices, which are essential for timely and effective software delivery. 


Choosing a build system depends on various factors, including the programming language used in 
a project, the platform compatibility required, and the development team’s familiarity with the tool. 
Some of the commonly used build systems include make and maven. 


An introduction to build systems 


Make 


Make is one of the oldest and most fundamental build systems available. It is primarily used for C 
and C++ projects. Make uses Makefiles to specify how to compile and link source files. Its primary 
advantage lies in its simplicity and broad support across different platforms. 


Its key features include the following: 


Flexibility: Make allows us to define explicit rules on how files should be compiled and linked. 


Wide support: It is available on most Unix-like systems as well as Windows. Also, make can 
be used with a variety of compilers and programming languages. On Windows, make can be 
used in environments such as Minimalist GNU for Windows (MinGW) or Cygwin. 


Next, let’s look at maven. 


Maven 


Maven is primarily used for Java projects. It is designed to provide a comprehensive and standard 
framework for building projects, handling documentation, reporting, dependencies, Source Control 
Management (SCM) systems, releases, and distribution. The key features include the following: 


Convention over configuration: Maven uses a standard directory layout and a default build 
life cycle to decrease the time spent on configuring projects 


Dependency management: It can automatically download libraries and plugins from repositories 
and incorporate them into the build process 


Project information management: Maven can generate project documentation, reports, and 
other information from a project’s metadata 


Build and release management: Maven supports the entire build life cycle, from compilation, 
packaging, and testing to deployment and release management 


Extensibility: Maven’s plugin-based architecture allows it to be extended with custom plugins 
to support additional tasks 


Other notable build systems include Apache Ant, which is a Java-based build system, and Gradle, 
which supports multiple programming languages but is especially favored within the Java ecosystem. 
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Before exploring the specifics of the make build system, it is important to familiarize ourselves with 
the fundamental components of build systems. These components form the backbone of the build 
process and include the following: 


e Source code: The raw, human-readable code written in programming languages such as C, 
Java, and Python. 


e Compiler: The tool that converts source code into machine-readable object files. Common 
compilers include GCC for C/C++ and javac for Java. 


e Linker: The tool that combines object files into a single executable or library file. 


e Build scripts: Scripts that describe the build process. They define what commands need to be 
run and their order. 


e Dependencies: External code libraries or tools required by a project that need to be integrated 
during the build process. 


e Artifacts: The output of build systems, which can include executables, libraries, or other formats 
needed to deploy or run software. 


In the upcoming section, we will explore the fundamentals of the make build system and learn how 
to write Makefiles that automate the build process for firmware projects. 


The Make build system 


In this section, we will explore the Make build system, from its basic concepts to practical usage in 
firmware development. 


The basics of Make 


The primary component of the make build system is the Makefile, which contains a set of directives 
used by the tool to generate a target. At its core, a Makefile consists of rules. Each rule begins with a 
target, followed by prerequisites, and then a recipe: 


e Target: This is typically the name of the file that the rule generates - in other words, the output 
file that needs to be generated, such as main.o or app. exe. The target can also be the name 
of the action to carry out. 


e Prerequisites: These are the source files needed to create the target (e.g., main.c and adc . c). 


e Recipes: The recipe is a series of commands that make executes in order to build the target. 


The Make build system 


The following diagram illustrates a simple make rule: 


target prerequisite 


main.o : main.c 
arm-none-eabi-gcc main.c -o main.o 


recipe 


Figure 5.1: A Make rule, with main.o as the target file to be generated from the 
prerequisite, main.c, using the arm-none-eabi-gcc main.c -o main.o recipe 


Note 


The line of the recipe must start with a tab. 


Makefiles also allow us to use variables to simplify and manage complex build commands and 


configurations. For instance, we can define a variable name, CC, to represent the compiler command, 
as shown in Figure 5.2: 


variable 
value 


Y 


CC = arm-none-eabi-gcc 


main.o : main.c 
$(CC) main.c -o main.o 


recipe 


Figure 5.2: A Make rule, with main.o as the target file to be generated from the prerequisite, 
main.c, where the compiler command in the recipe is replaced by a variable 
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Variables in Makefiles allow us to store text strings that can be reused throughout a file. The most 
basic way to define a variable is by simple assignment (=): 


CC = arm-none-eabi-gcc 


This line sets the CC variable to the arm-none-eabi -gcc cross-compiler. 


Once defined, variables can be used throughout the Makefile to simplify commands and definitions. 
To use a variable, enclose its name in$(...) or ${...}: 


$(CC) main.c -o main.o 


The recipe uses the $ (CC) variable to refer to the compiler set earlier (arm-none-eabi-gcc). 


Apart from user-defined variables, there are special variables related to targets and prerequisites that 
come in handy when writing Makefiles. 


In make, special variables related to targets help streamline the process of specifying filenames and 
file paths, making the rules within a Makefile more general and reusable. One of the most commonly 
used special variables for targets is '$@' - Target Name. This variable represents the name of 
the target for the rule. It is particularly useful when the target name is repeated multiple times within 
a rule, which is common in link and compile commands. 


Let’s see an example: 


main.o : main.c 


arm-none-eabi-gcc main.c -o $@ 


In this example, $@ is replaced by main . o, which is the target of the rule. 


Make also provides special variables to reference prerequisites. One of the most commonly used is 
$^. This variable lists all the prerequisites of a target, with spaces between them (if more than one). 
Let’s see an example: 


main.o : main.c 


arm-none-eabi-gcc $^ -o main.o 


When the preceding snippet of the Makefile is executed, $^ is replaced with main . c, effectively 
running the arm-none-eabi-gcc main.c -o main.o command. 


Special variables in Makefiles are very useful in improving efficiency and flexibility when defining build 
rules. By effectively using these variables, we can create more robust and maintainable build systems. 


We will conclude this section with Figure 5.3. This figure illustrates the revised rule from Figure 5.2, 
now incorporating our user-defined variable along with the special variables related to targets 
and prerequisites. 
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CC = arm-none-eabi-gcc 


main.o : main.c 


$(CC) $^ -o $@ 


| 
recipe 


Figure 5.3: A Make rule using a user-defined variable and two special variables 


In the next section, I will guide you through the process of setting up a make build system on your 
development computer. 


Installing and configuring Make 


In this section, we will go through the process of downloading, installing, and configuring the make 
build system on a Windows environment. Let’s begin: 


1. Download make: We begin by navigating to the appropriate website to download GNU Make 
for Windows. For this example, we'll use SourceForge, a popular repository for open source 
projects. Go to https: //gnuwin32.sourceforge.net/packages/make.htm. 


Under the Complete package, except sources option description, click Setup under the 
Download column to start downloading. 


2. Install make: Once the download is complete, follow the installation wizard to install make 
on your computer. When you reach the step titled Select Destination Location, make sure 
the path is set to C:\Program Files (x86) \GnuWin32. 


3. Set up the environment variable: To use make from any command line or script, we need to 
add its executable to our system environment variables, following the same process we used 
in Chapter 1 to add OpenOCD to the environment variables. 


We do this by navigating to the bin folder where make was installed (C:\Program Files 
(x86) \GnuWin32\bin) and then copying the path. 


Then, we do the following: 

I. Right-click on This PC, and then choose Properties. 

II. Search for and select Edit the system environment variables. 

II. Click the Environment Variables button in the System Properties pop-up window. 


IV. Under the User Variables section of the Environment Variables pop-up, double-click 
the Path entry. 
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V. Inthe Edit environment variable popup, click on New to create a row for a new path entry. 
VI. Paste the previously copied make path into this new row. 


VII. Confirm your changes by clicking OK on the various pop-up windows. 


To confirm that the make is properly set up, open command prompt and simply type make, as 
shown here: 


make 


It should return the following: 


make: *** No targets specified and no makefile found. Stop. 


This confirms that the make build system is properly configured on the Windows machine. 


On many Linux distributions, make is readily available through the distribution’s package manager. 
For instance, on Ubuntu and other Debian-based distributions, we can install make (along with other 
build essentials, such as the GCC compiler) by running the following: 


sudo apt install build-essential 


On macOS, make is part of the Command Line Tools package that comes with Xcode, Apple's suite of 
development tools. This means that if you have installed Xcode, you will have make already installed. 
We can also install the standalone Command Line Tools package by running the following command 
in the terminal: 


xcode-select --install 


In this section, we successfully set up the make build system on our development machine. In the 
following section, we will apply the concepts covered in this chapter to write our own Makefile. 


Writing Makefiles for firmware projects 


The focus of this section is to write a Makefile and successfully test it. Let's begin. 


In our workspace folder, lets make a new folder named 4_Makefiles. In this folder, create a file 
called Makefile. This file must start with a capital M and should have no extension. 


If you're using Windows and it asks whether you really want to change the file extension, click Yes. 
Then, right-click the file and open it with a basic text editor, such as Notepad++. 


Writing Makefiles for firmware projects 


Our objectives with the Makefile can be summarized as follows: 
1. The compilation of source code: We want to compile source files (main.cand stm32f£411_ 
startup.) into object files (main.o and stm32f411_startup.o). 


2. Linking object files into an executable: Then, link the compiled object files, along with 
setting a specific memory layout using the linker script, (stm32_1s . 1d) to create a final 
executable (4_ makefile project.elf). 


3. Loading and cleaning: 
I. Invoke OpenOCD to start the process of uploading the final executable onto the 
target hardware 


I. Conclude by providing a way to clean the build environment by removing all generated 
files (* .o, *.e1£., and * . map), allowing for a fresh start with no leftover artifacts 
from previous builds 


This is our complete Makefile: 


muned g Al ine Ueeesl ILS) project -eli 


main.o : main.c 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb -std=gnull main.c -o 
main.o 


stm32f411 startup.o : stm32f411 startup.c 
arm-none-eabi-gcc c -mcpu=cortex-m4 -mthumb -std=gnull stm32f411_ 
Starkupye —o Stms2 £41 istareupre 


4amake ri Vemprojechnclr s-amainroOms EMS 24st Up. © 
anm-none-eabi—gee —-nostdlaby i stm3 2s pide * oo 4umalketivlies 
project.elf -Wl,-Map=4 makefile project.map 


load 

openocd -f board/st_nucleo_f£4.cfg 
elean; 
del -f *.0 *.elf *.map 
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Let’s break it down: 


1. 


Create the final target: final : 4 makefile project.elf 
This line deals with the creation of the final target. 


When we execute make final, make will check whether the 4 makefile project. 
elf target needs to be updated before executing it. This is a dependency relationship where 
final acts purely as an aggregation point to invoke all the build processes leading up to 4__ 
makefile project.elf. 


Create an object file for main.c: 
main.o : main.c 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb -std=gnull main.c 
-O main.o 


The make rule compiles the main.c source file into an object file named main.o. Upon 
close inspection, you can see that the command employed here is identical to the one we used 
at the command prompt to manually compile main. c in the previous chapter. 


Create an object file for stm32f411_startup.c: 
stm32f411 startup.o : stm32f411 startup.c 


arm-none-eabi-gcc -c -mcpu=cortex-m4 -mthumb -std=gnull stm32f411_ 
startup.c -o stm32f411 startup.o 


This compiles the stm32£411_startup.c source file into an object file, named stm32£411_ 
startup.o. 


Link to create the final executable: 
4 makefile project.elf : main.o stm32f411 startup.o 


arm-none-eabi-gcc -nostdlib -T stm32 1s.1ld *.o -o 4 makefile_ 
project.elf -W1,-Map=4 makefile project.map 


This rule links the main.oand stm32£411_startup.o object files to produce the final 
executable, 4 makefile project.elf. Additionally, it generates a map file named 
4 makefile project .map that shows where each part of the code and data is loaded 
in memory, which is useful for debugging. 


Load the target: 
load: 
openocd -f board/st_nucleo f4.cfg 


This rule initiates OpenOCD to begin the process of loading the final executable onto the target 
hardware. It executes OpenOCD, using a configuration file tailored for the STM32 Nucleo F4 
board, specifically st_nucleo_f£4.cfg. 
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6. Clean the target: 
clean: 
del -f *.0 *.elf *.map 


This command cleans the build directory by removing all generated files, ensuring a clean 
environment for subsequent builds. The del -f command forcefully deletes files, preventing 
prompts that ask for deletion confirmation. The * . o, * .e1f, and * . map patterns specify that 
all object files, ELF executables, and map files, respectively, should be deleted. 


With the Makefile ready, it is time to test it out. 


Testing our Makefile 


Before proceeding to the command line, let’s ensure that the linker script, startup file, and all source 
files are placed in the correct directory. Additionally, we will slightly update the main . file, which 
will allow us to validate that the most recent version of the firmware executes correctly: 


1. Transfer the required files: 


I. Locate the main. c file from the previous project (3_ LinkerscriptAndStartup), 
which includes the foundational application code. 


II. Additionally, locate stm32_1s.1d (the linker script) and stm32£411_startup.c 
(the startup file). 


II. Copy and paste these files (stm32_1s.1d,stm32f£411_startup.c,andmain.c) 
into the 4. Makefiles folder. 


2. Modify the main application code: To visually confirm the successful execution of our latest 
firmware, let’s adjust the LED blink rate within the main. c file from slow to fast: 


IL. Open the main.c file for editing: Right-click on the main.. file within the 4 __ 
Makefiles folder and select the option to open it with a simple text editor, such 
as Notepad++. 


I. Change the blink rate: Search for the code segment that controls the toggling of PAS 
(LED_ PIN). Adjust the delay intervals within this section to change the LED’s blink 
rate from its current slower pace to a rapid one. The current one should look like this: 


// 22: Toggle PAS(LED PIN) 
GPIOA OD_ R “= LED PIN; 
for(int i = 0; i < 5000000; i++) {} 
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Replace the current code with the following snippet to toggle the state of PAS to a faster rate: 


// 22: Toggle PA5(LED_ PIN) 
GPIOA_OD_ R “= LED PIN; 
for(int i = 0; i < 100000; i++) {} 


We simply change the loop iteration from 5 million to 100,000. 


III. Save the changes: After updating the code, save the main.c file. 


Now, let’s access our new folder through the command prompt, following the steps we used in the 
previous chapter. 


In the command prompt, simply execute the following: 
make final 
This will create our final executable, 4. makefile project.elf. 
Next, we will begin uploading the final executable onto our microcontroller by executing the following: 


make load 


This will launch OpenOCD. The next step involves using the GNU Debugger (GDB) to upload the 
firmware to the microcontroller, as we did in the previous chapter. Let’s access another command 


prompt window (as OpenOCD should keep running in the first one) and enter the following command 
to start the GDB: 


arm-none-eabi-gdb 

Once GDB is open, we establish a connection to our microcontroller by running the following: 
target remote localhost: 3333 

Let’s reset and initialize the board, as we learned in Chapter 3, using the following command: 
monitor reset init 

Next, we load the firmware onto the microcontroller using the following command: 
monitor flash write image erase 4 makefile project.elf 

After successfully loading the firmware, we reset the board again with the same reset command: 
monitor reset init 

Finally, we resume the execution of the firmware on the microcontroller with the following: 


monitor resume 
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You should see the LED blinking at a rapid rate, indicating the successful upload and execution of 
our new firmware. 


We can stop the GDB by executing the following: 
quit 

And then, we execute the following when asked if we want to quit anyway: 
Y 


To clean our build directory, we will open the command prompt in the build directory and execute 
the following command: 


make clean 
This will delete all the .0, .e1f, and . map files in the build directory. 
Before concluding this chapter, let’s explore how our Makefile appears when we incorporate special 
and user-defined variables. 
Applying special and user-defined variables 
Let’s apply special variables and user-defined variables to our makefile: 


cc = arm-none-eabi-gcc 
CFLAGS = -c -mcpu=cortex-m4 -mthumb -std=gnull 
LDFLAGS = -nostdlib -T stm32_ls.ld -W1l,-Map= 5 makefile project_v2.map 


Timell 9 5 mekerile project v2 eli 


main.o : main.c 
$(CC) $(CFLAGS) $^ -o $@ 


stm32f411 startup. o : stm32f411 startup. c 
$(CC) $(CFLAGS) $^ -o $@ 


5 meikerile project vA elm g Teim O SEMATA SEENT 
$(CC) $(LDFLAGS) $^ -o $@ 


load 

openocd -f board/st_nucleo_f4.cfg 
clean: 

del -f *.0 *.elf *.map 
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In this version, weve defined three essential variables to streamline our Makefile: 


e CC: This variable represents the compiler used to compile the source files. It simplifies the 
Makefile by centralizing the compiler definition, making it easier to update or change if needed. 


e CFLAGS: This holds the compilation flags necessary to build the source files. 


e LDFLAGS: This contains the linker flags that dictate how an executable is linked from the 
object files. 


Additionally, weve used special variables for target names ($@) and prerequisite lists ($^) to replace 
explicit mentions of these components in the make recipes, further simplifying the Makefile structure. 


Update your current makefile to this new version and upload the new executable, named 5_makefile_ 
project_v2.el1f, onto your microcontroller. This updated version should function seamlessly, 
just like the previous one. 


Summary 


In this chapter, we embarked on an exploration of the make build system, a cornerstone tool for 
automating the build process in software development. The journey began with an introduction to 
what build systems are and their critical role in converting source code into deployable software, such 
as executables and libraries. 


We then delved into the specific mechanics of the make build system, starting with the foundational 
elements of a Makefile, which include targets, prerequisites, and recipes. These components were 
thoroughly discussed to provide a clear understanding of how they interact within make to manage 
and streamline the compilation and linking of software projects. 


This chapter wrapped up with a practical demonstration of writing Makefiles, effectively consolidating 
the theoretical concepts discussed throughout. This hands-on experience ensures that you are well- 
equipped to apply these strategies to your own firmware projects. 


In the next chapter, we will transition to another critical aspect of firmware development - the 
development of peripheral drivers, beginning with General Purpose Input/Output (GPI/O) drivers. 
This shift will introduce you to the fundamentals of interfacing with hardware components, a pivotal 
skill in embedded systems development. 
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The Common Microcontroller 
Software Interface Standard 
(CMSIS) 


In this chapter, we will delve into the Common Microcontroller Software Interface Standard (CMSIS), 
a critical framework for Cortex-M and some Cortex-A processors. We will begin by learning how to 
define hardware registers using C structures. This foundational knowledge will enable us to read and 
understand CMSIS-compliant header files provided by microcontroller manufacturers. 


Next, we will explore CMSIS itself, discussing its components and how it facilitates efficient software 
development. Finally, we will set up the necessary header files from our silicon manufacturer, 
demonstrating how CMSIS compliance can streamline production and improve code portability. 


In this chapter, were going to cover the following main topics: 


e Defining peripheral registers with C structures 
e Understanding CMSIS 
e Setting up the required CMSIS files 


By the end of this chapter, you will be equipped with a solid understanding of CMSIS and how to use 
it to enhance code portability in your Arm Cortex-M projects. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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Defining peripheral registers with C structures 


In embedded systems development, defining hardware registers using C structures is a fundamental 
technique that enhances code readability and maintainability. In this section, we will explore how 
to use C structures to represent peripherals and their registers, drawing on practical examples and 
analogies to simplify the concept. 


In previous chapters, we configured a General Purpose Input/Output (GPIO) pin (PAS) to turn on 
an LED by manually defining the address of each required register. We learned how to find the correct 
addresses from documentation, define registers, and define register bits. This method, while effective, 
can become cumbersome as projects grow in complexity. 


To streamline this process, we can use C structures to represent peripherals and their registers. This 
approach groups related registers into a cohesive unit to match the hardware architecture and memory 
map of our microcontroller, making the code more intuitive. 


Let’s create a structure to represent the GPIO peripherals and their associated registers. 


To achieve this, we need to get the base address of each GPIO port and the offset of each register 
within these ports. Here, offset refers to the register’s address relative to the peripheral’s base address. 


Before diving into the details of creating the structure, it’s important to understand how to obtain the 
necessary base addresses and offsets. 


Getting the base address and offsets of registers 


In Chapter 2, we learned how to locate the base addresses of peripherals in our datasheet. Specifically, 
we examined pages 54 to 56 of the STM32F411 datasheet, which lists the base addresses for the 
microcontroller’s peripherals. Here are the extracted base addresses for the GPIO and Reset and 
Clock Control (RCC) peripherals: 


Peripheral Base Address 


0x4002 0000 
0x4002 0400 
0x4002 0800 
GPIOD 0x4002 0C00 
GPIOE 0x4002 1000 
GPIOH 0x4002 1C00 
RCC 0x4002 3800 


Table 6.1: Base addresses of GPIO and RCC 
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Also, in Chapter 2, we covered how to extract the register offsets from the reference manual (RM383). 
Here are the extracted offsets for all the GPIO registers: 


Register Offset 


GPIOx_MODER 0x00 
GPIOx_OTYPER 0x04 


GPIOx_OSPEEDR 0x08 


GPIOx_PUPDR 0x0C 
GPIOx_IDR 0x10 
GPIOx_ODR 0x14 
GPIOx_BSRR 0x18 
GPIOx_LCKR 0x1C 
GPIOx_AFRL 0x20 
GPIOx_AFRH 0x24 


Table 6.2: Offsets of GPIO register 


Table 6.1 shows all the GPIO registers of our STM32F411 microcontroller along with their offsets, 
arranged in the same order they appear in memory. Almost all registers in our microcontroller are 
32 bits (4 bytes) in size. As illustrated in Table 6.1, each register is offset by 4 bytes from the previous 
one. For instance, the GPIOx_OTYPER register at 0x04 is 4 bytes from the GPIOx_MODER register 
at 0x00 (0x04 - 0x00 = 4). Similarly, the GPIOx_PUPDR register at 0xOC is 4 bytes from the 
GPIOx_OSPEEDR register at 0x08. 


This tells us that the registers are contiguously arranged in that memory region, since we know that 
each register is 4 bytes in size. 


However, this contiguous arrangement is not always the case. There are instances where gaps of a few 
bytes are left between registers within a peripheral. 


Now, whats the relationship between the offset and the base address? 


Imagine your microcontroller as a large apartment complex. Each apartment represents a peripheral, 
such as GPIO or RCC, and the entrance to each apartment is the peripheral’s base address. Inside each 
apartment, there are several rooms, which represent the registers. Each room has a specific purpose 
and is located at a certain distance from the entrance, known as the offset. 
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For example, when you enter the GPIO apartment (peripheral), the living room might be the GPIOx_ 
MODER register located right at the entrance (offset 0x00). The kitchen could be the GPIOx_OTYPER 
register, located a bit further down the hallway (offset 0x04). The bedroom might be the GPIOx_ 
OSPEEDR register, located even further down the hallway (offset 0x08), and so on. 


This arrangement shows that each room (register) is placed at a fixed distance (offset) from the 
entrance (base address). In our case, since each room is 4 bytes in size, every subsequent room is 4 
bytes away from the previous one. However, in some apartments (peripherals), there might be extra 
space between rooms, indicating non-contiguous placement of registers. This is similar to having a 
small hallway between rooms, which you'll notice when you examine peripherals such as the RCC 
peripheral in the reference manual. 


Now, let’s go ahead with implementing the peripheral structures using what we have learned so far. 


Implementing the peripheral structures 
The following is our GPIO_TypeDef structure, representing the GPIO peripherals: 


typedek& struct 


{ 
volatile uint32_ t MODER; /*offset: 0x00 * / 
volatile uint32_t OTYPER; /*offset: 0x04 * / 
volatile uint32_t OSPEEDR; /*offset: 0x08 * / 
volatile uint32_ t PUPDR; /*offset: Ox0C * / 
volatile Uant32 E IDR; /*offset: 0x10 * / 
volatile uint3!2_t ODR; /*offset: 0x14 * / 
volatile uint32 t BSRR; /*offset: 0x18 * / 
volatile uint32_ t LCKR; /*offset: 0x1C * / 
volatile uint32 t AFRL; /*offset: 0x20 */ 
volatile uint32_ t AFRH; /*offset: 0x24 * / 


} GPIO TypeDef ; 


Let’s break down the syntax: 


The line typedef struct begins the definition of a new structure type. typedef is used to 
create an alias for the structure, allowing us to use GPIO_TypeDef as a type name later in the code. 


Each member of the structure is declared as volatile uint32_t. Heres the breakdown: 


e volatile: This keyword indicates that the value of the variable can change at any time, often 
due to hardware changes. The compiler should not optimize accesses to this variable. 
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e uint32_t: This indicates that each member of the structure is a 32-bit (4-byte) unsigned 
integer. This is important because the registers we are working with are also 32 bits in size. 
To ensure that the structure members accurately represent these registers, they must match 
this size. This alignment guarantees that each member corresponds correctly to its respective 
register in the memory map. 


Also note that the structure members are arranged in the same order and have the same size as the 
registers, as specified in the reference manual. 


Now, as we discussed in Chapter 2, to use any peripheral in the microcontroller, we first need to enable 
clock access to that peripheral. This is done through the RCC peripheral. Let’s create the structure 
for the RCC peripheral. 


This is our RCC structure: 


typedef struct 
volatile uint32 t DUMMY[12]; 
volatile uint32_ t AHB1ENR; /*offset: 0x30*/ 


} RCC_TypeDef; 


The RCC peripheral of the STM32F411 microcontroller has about 24 registers, which are not 
contiguous, leaving gaps in the memory region. The register we are interested in for the purposes of 
GPIO peripherals is the AHB1ENR register, which has an offset of 0x30. 


In our RCC_TypeDef, we have added padding to the structure with the number of uint32_t 
(4 bytes) items required to reach the offset 0x30. In this case, it is 12 items. This is because 4 bytes 
multiplied by 12 equals 48 bytes, which corresponds to 0x30 in hexadecimal notation. 


At this point, we have defined two important structures (GPIO_TypeDef and RCC_TypeDef) 
required to configure and control our GPIO pins. The next step involves creating pointers to the 
base addresses of the GPIO and RCC peripherals using these structures. This allows us to access and 
manipulate the peripheral registers in a structured and readable manner. Here is the code snippet 
that accomplishes this: 


#define RCC_BASE 0x40023800 
#define GPIOA BASE 0x40020000 
#define RES ( (RCE TypebDet*)) REC BASE) 


#define GPIOA ((GPIO_ TypeDef*)GPIOA BASE) 
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Let’s break it down: 


#define RCC BASE 0x40023800 

This line defines the base address of the RCC peripheral. The address value is taken from Table 6.1. 
#define GPIOA BASE 0x40020000 

This line defines the base address of the GPIOA peripheral. 

#define RCC ((RCC_TypeDef*) RCC_BASE) 


This line defines a macro, RCC, that casts the RCC_BASE base address to a pointer of type RCC_ 
TypeDef*. 


By doing this, RCC becomes a pointer to the RCC peripheral, allowing us to access its registers 
through the RCC_TypeDef structure. 


#define GPIOA ((GPIO TypeDef*) GPIOA BASE) 


Similarly, this line defines a macro, GPIOA, that casts the GPIOA_BASE base address to a 
pointer of type GPIO_TypeDef*. 


This makes GPIOA a pointer to the GPIOA peripheral, enabling access to its registers via the 
GPIO_TypeDef structure. 


With this accomplished, we are now ready to test out our implementation. Let’s do that in the next section. 


Evaluating the structure-based register access method 


Let’s update our previous project to use the structure-based register access approach: 


// 0: Include standard integer types header for fixed-width //integer 
types 
#include <stdint.h> 


// 1: GPIO_TypeDef structure definition 


typedef struct 


{ 


volatile uint32_t MODER, /*offset: 0x00 * / 
volatile uint32_ t OTYPER; /*offset: 0x04 * / 
volatile uint32_t OSPEEDR; /*offset: 0x08 * / 
volatile uint32_t PUPDR; /*offset: Ox0C * / 
Volatile Uantsi2 te TDR; /*offset: 0x10 * / 
volatile uint32_ t ODR; /*offset: 0x14 * / 
volatile uint32_t BSRR; /*offset: 0x18 * / 
volati kemuinto 2 EKR; /*offset: 0x1C */ 


volatile uint32 BARRE /*offset: 0x20 */ 
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volatile uint32 t AFRH; /*offset: 0x24 * / 
} GPIO_TypeDef; 


// 2: RCC_TypeDef structure definition 
typedef struct 
volatile uint32 t DUMMY[12]; 
volatile uint32_t AHB1ENR; /*offset: 0x30*/ 


} RCC_TypeDef; 


// 3: Base address definitions 
#define RCC_BASE 0x40023800 
#define GPIOA BASE 0x40020000 


// 4: Peripheral pointer definitions 
#define RCC ((RCC_TypeDef*) RCC_BASE) 
#define GPIOA ((GPIO_TypeDef*)GPIOA BASE) 


//5: Bit mask for enabling GPIOA (bit 0) 
#define GPIOAEN (1U<<0) 

//6: Bit mask for GPIOA pin 5 

#define PINS (1U<<5) 

//7: Alias for PIN5 representing LED pin 
#define LED PIN PINS 


// 8: Start of main function 
int main(void) 


{ 


// 9: Enable clock access to GPIOA 


RCC->AHB1ENR |= GPIOAEN; 
GPIOA->MODER |= (1U<<10); // 10: Set bit 10 to 1 
GPIOA->MODER &= ~(1U<<11); // 11: Set bit 11 to 0 


fi Mile Seerne Cit slineaigalice LOG3 
while (1) 
{ 
// 12: Set PA5 (LED PIN) high 
GPIOA->0DR*= LED PIN; 
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// 13: Simple delay 
for(int i=0;1<100000;i++) {} 


} 


In this new implementation, we access the required registers using the C structure pointer operator 
(->). Heres a breakdown of RCC- >AHB1ENR: 


e RCC: This is the pointer to a structure of type RCC_TypeDef. This pointer allows us to access 
the RCC registers using the structures members. 


e ->: This is the structure pointer operator in C. It is used to access a member of a structure 
through a pointer. 


e AHB1ENR: This is a member of the RCC_TypeDef structure. 


Similarly, we use the same approach to access GPIOA registers. 


Here’s a breakdown of GPIOA- >MODER and GPIOA->ODR: 


e GPIOA: This is a pointer to the structure of type GPIO_TypeDef, allowing access to 
GPIOA registers 


e MODER: A member of the GPIO_TypeDef structure, representing the GPIO port mode register 


e ODR: Another member of the GPIO_TypeDef structure, representing the GPIO port output 
data register 


Build the project and run it on your development board. It should work the same way as the 
previous project. 


In this section, we learned how to define hardware registers using C structures. This technique is an 
important step toward understanding CMSIS-compliant code, which will be covered in the next sections. 


Understanding CMSIS 


In this section, we will explore CMSIS, an important framework designed for Arm Cortex-M and some 
Cortex-A processors. CMSIS is a vendor-independent hardware abstraction layer that standardizes 
software interfaces across various Arm Cortex-based microcontroller platforms, promoting software 
portability and reusability. Let’s start by understanding CMSIS and its key benefits. 


Understanding CMSIS 


What is CMSIS? 


CMSIS, pronounced See-M-Sys, is a standard developed by Arm to provide a consistent and efficient 
way to interface with Cortex-based microcontrollers. It includes a comprehensive set of APIs, software 
components, tools, and workflows designed to simplify and streamline the development process for 
embedded systems. 


Its key benefits include the following: 


Standardization: CMSIS standardizes the interface for all Cortex-based microcontrollers, 
making it easier for you to switch between different microcontrollers and tools without having 
to relearn or reconfigure your code base 


Portability: By providing a consistent API, CMSIS allows software developed for one 
microcontroller to be easily ported to another, enhancing code reuse and reducing development time 


Efficiency: CMSIS includes optimized libraries and functions, such as Digital Signal Processing 
(DSP) and neural network kernels, which improve performance and reduce the memory footprint 


CMSIS has several components. Let’s explore some of the commonly used ones. 


Key components of CMSIS 


CMSIS comprises several components, each serving a unique purpose: 


CMSIS-Core (M): This component is designed for all Cortex-M and SecurCore processors. It 
provides standardized APIs for configuring the Cortex-M processor core and peripherals. It 
also standardizes the naming of device peripheral registers, which helps reduce the learning 
curve when switching between different microcontrollers. 


CMSIS-Driver: This provides generic peripheral driver interfaces for middleware, facilitating 
the connection between microcontroller peripherals and middleware such as communication 
stacks, filesystems, or graphical user interfaces. 


CMSIS-DSP: This component offers a library of over 60 functions for various data types, 
optimized for the Single Instruction Multiple Data (SIMD) instruction sets available on 
Cortex-M4, M7, M33, and M35P processors. 


CMSIS-NN: This stands for neural network. It includes a collection of efficient neural network 
kernels optimized to maximize the performance and minimize the memory footprint on 
Cortex-M processor cores. 


CMSIS-RTOS: There are two versions, RTOS v1 and RTOS v2. RTOS v1 supports Cortex-MO, 
M0+, M3, M4, and M7 processors, providing a common API for real-time operating systems. 
RTOS v2 extends support to Cortex-A5, A7, and A9 processors, including features such as 
dynamic object creation and multi-core system support. 


139 


140 


The Common Microcontroller Software Interface Standard (CMSIS) 


e CMSIS-Pack: Pack outlines a delivery system for software components, device parameters, 
and evaluation board support. It streamlines software reuse and facilitates effective product 
life cycle management. 


e CMSIS-SVD: SVD stands for System Viewer Description. It defines device description files 
maintained by the silicon vendor, containing comprehensive descriptions of the microcontroller 
peripherals and registers in XML format. Development tools import these files to automatically 
construct peripheral debug windows. 


Another crucial aspect of CMSIS is its coding standard. Let’s take a closer look to familiarize ourselves 
with these guidelines. 


The CMSIS coding rules 


Here are the essential coding rules and conventions used in CMSIS: 


e Compliance with ANSI C (C99) and C++ (C++03): This ensures compatibility with widely 
accepted programming standards. 


e Standard data types: Uses ANSI C standard data types defined in <stdint . hs, ensuring 
consistency in data representation. 


e Complete data types: Variables and parameters are defined with complete data types, 
avoiding ambiguity. 


e MISRA 2012 conformance: While CMSIS conforms to the MISRA 2012 guidelines, it does not 
claim full MISRA compliance. Any rule violations are documented for transparency. 


Additionally, CMSIS uses specific qualifiers: 


e __ I for read-only variables (equivalent to volatile const in ANSI C) 
e __ O for write-only variables 
e 10 for read and write variables (equivalent to volatile in ANSI C) 


These qualifiers in CMSIS provide a convenient way to specify the intended access mode for variables, 
particularly for memory-mapped peripheral registers. 


With this background knowledge, let's go ahead and learn how to use CMSIS in our embedded projects. 


The CMSIS-Core files 


The CMSIS-Core files are categorized into two primary groups, each serving a specific purpose in the 
development process. These groups are the CMSIS-Core standard files and the CMSIS-Core device 
files. Figure 6.1 provides a comprehensive overview of the CMSIS-Core file structure, illustrating the 
different types of files and their roles in a project. 


Understanding CMSIS 


User Program = CMSIS-Core Standard Files (Arm) 
Figure 6.1: The CMSIS-Core files 


=) CMSIS-Core Device Files (Silicon Vendor) 


Let’s analyze the diagram. 


The files are divided into three categories: CMSIS-Core Standard Files, CMSIS-Core Device Files, 
and User Program Files. 


The CMSIS-Core Standard Files category of files is provided by Arm and typically does not require 
modifications. They include the following: 


e core _<cpu>.h: This provides access to the CPU and core-specific functionalities 


e cmsis_ compiler. h: This contains core peripheral functions, CPU instruction access, and 
SIMD instruction access 


e <arch>_<feature>.h: This defines architecture-specific attributes and features 
e cmsis <compiler>_m.h: This is a toolchain-specific file that aids in compiler compatibility 


and optimization 


The next category is CMSIS-Core Device Files. These files are provided by silicon vendors (such as 
STMicroelectronics) and may require application-specific modifications. They include the following: 


e system_<Device>.c: This file handles system and clock configuration 

e partition_<Device>.h: This manages secure attributes and interrupt assignments 
e startup_<Device>.c: This contains the device startup interrupt vectors 

e <Device>.h: This provides access to the CMSIS device peripheral functionalities 


e system_<Device>.h: This assists in system and clock configuration 
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The third category is User Program Files. These are created by us, the developers, and include the 
main application code along with other user-defined functionalities essential for the project’s operation. 


Understanding CMSIS is fundamental for efficient and standardized development across various Arm 
Cortex-based microcontroller platforms. In this section, we have explored its key components, the 
CMSIS coding standards, and the CMSIS-Core files crucial to the development of embedded projects. 
In the next section, we will learn how to include the required CMSIS files in our project, enabling us 
to leverage the full potential of this robust framework for developing embedded systems. 


Setting up the required CMSIS files 


In this section, we will work through the process of integrating CMSIS files into our project. These 
files also contain the definitions of all the registers and their respective bits, making it easier to manage 
and configure peripherals without manually defining each register. 


Getting the right header files 
Let's start by downloading the package for our microcontroller from the STMicroelectronics website: 


1. Open your browser and go to https: //www.st.com/content/st_com/en.html. 
2. Search for STM32CubeF4 to locate the package for the STM32F4 microcontroller series. 
3. Next, download the STM32CubeF4 package: 
I. Locate the STM32CubeF4 package and download the latest version. Make sure not to 
download the patch version. 
II. Once the download is complete, unzip the package. You will find several subfolders 
inside, including the Drivers folder. 


Our next steps involve organizing the files: 


1. Inside your project workspace, create a new folder named chip headers. 
Within the chip headers folder, create another folder named CMSTS. 
Navigate to Drivers/CMSTS in the extracted package. 


Copy the entire Include folder from Drivers/CMSIS into chip_headers/CMSIS. 


Ole ee ee aS 


Next, copy the Device folder from Drivers/CMSTIS into chip _headers/CMSIS. 
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Finally, we clean up the Device folder: 


1. 
2. 


Navigate to chip_headers/CMSIS/Device/ST/STM32F4xx. 


Delete all files and folders, except the Include folder. This ensures you keep only the necessary 
header files for your specific microcontroller. 


Moving forward, we will consistently include the chip headers folder in our project directories. 
This practice ensures that anyone running our projects on a different computer won't encounter errors 
due to missing header files. 


In the next section, we'll create a new project in STM32CubeIDE. After setting up the project, we'll 
copy the chip_headers folder and paste it into the project directory. Subsequently, we'll add the 
subfolders within chip headers to our project's include paths, ensuring seamless integration 
of the necessary CMSIS files. 


Working with CMSIS files 


In this section, we will integrate the CMSIS files into our project by adding the relevant folders to 
our project’s include paths. We will then test our setup by updating our main. file to use these 
CMSIS files instead of our manually defined peripheral structures. 


Let’s begin by creating a new project following the steps outlined in Chapter 1.1 will name my project 
header_files. Once the project is created, Pll copy and paste the chip headers folder into 
the project folder. 


Our next task involves adding the paths of the subfolders within the chip_headers folder to our 
project’s include paths: 


1. 


2 
3 
4. 
5 


oS 


Open STM32CubeIDE, right-click on your project, and select Properties. 
Once the Properties window opens, expand the C/C++ General options. 
Select Paths and Symbols. 

Under the Includes tab, click Add to add a new directory. 


Enter ${ProjDirPath}\chip headers\CMSIS\Include to add the Include 
folder located in our CMSIS folder, and then click OK to save. 


Click Add again to add another directory. 


Enter ${ProjDirPath}\chip headers\CMSIS\Device\ST\STM32F4xx\ Include 
to add the Include folder located in the STM3 2F4xx subdirectory and click OK to save. 
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Figure 6.2 illustrates the Project Properties window after adding these directories to the project’s 
include paths. 


(05 Properties for Chapter6-HeaderFiles oO x 
| | Paths and Symbols y ris 
Resource 
Builders 
C/C++ Build Configuration: Debug [ Active] v | Manage Configurations... 
v C/C++ General 
Code Analysis r 
Documentation B Includes | # Symbols | æ Libraries | Œ Library Paths | GS Source Location | 5) References | 
File Types 
Formatter Languages Include directories Add... 
Indexer GNU C | {Bine r 
Language Mappings Assembly (2. $(ProjDirPath)\chip_headers\CMSIS\Include = 
Paths and Symbols (© $(ProjDirPath)\chip_headers\CMSIS\Device\ST\STM32F4xx\... Delete 
Preprocessor Include Pat 
CMSIS-SVD Settings Export 
Project References 
Run/Debug Settings Move Up 
Move Down 
Using relative paths is ambiguous and not recommended. It can cause unexpected effects. 
M Show built-in values 
@ Import Settings... | EX Export Settings... 
r 5 Restore Defaults Apply 
O) Apply and Close Cancel 


Figure 6.2: The Includes tab in the project properties window 


Let’s break down the two lines we have just added: 
e ${ProjDirPath}\chip headers\CMSIS\Include: 


* ${ProjDirPath}: This is a macro in STM32CubeIDE that represents the root directory 
of your current project. It’s a placeholder that dynamically points to the directory where 
your project is located. 
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* \chip headers\CMSIS\Include: This specifies the path relative to the project’s 
root directory. It points to the Include folder within the CMSTS directory inside the 
chip headers folder. This folder contains general CMSIS include files, which provide 
core functionalities and definitions for the Cortex-M processor. 


e ${ProjDirPath}\chip headers\CMSIS\Device\ST\STM32F4xx\ Include: 


* ${ProjDirPath}: As mentioned earlier, this macro represents the root directory of 
your current project. 


* \chip headers\CMSIS\Device\ST\STM32F4xx\ Include: This specifies 
another path relative to the project's root directory. It points to the Include folder within 
the STM3 2F4xx subdirectory inside the Device directory, which is located in the CMSIS 
directory in the chip headers folder. This folder contains device-specific include files 
for the STM32F4xx series microcontrollers, providing definitions and configurations specific 
to this family of microcontrollers. 


Before closing the project properties dialog box, we need to specify the exact version of the STM32F4 
microcontroller we are using. This ensures that the appropriate header file for our specific microcontroller 
is enabled in our project. As we can see, the STM32F4xx\ Include subfolder contains header files 
for various microcontrollers within the STM32F4 family. The NUCLEO-F411 development board 
has the STM32F411 microcontroller. 


To configure our project for the STM32F411 microcontroller: 
1. Click on the # Symbols tab. 
Click Add... to add a new symbol. 


2 
3. Inthe Add Symbol dialog box, enter STM32F411xE in the Name field and then click OK. 
4 


Click Apply and Close to save all changes and close the Project Properties window. 
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Figure 6.3 illustrates the Symbols tab of the project properties window after adding the 
STM32F411xE symbol. 
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Figure 6.3: The Symbols tab in the project properties window 


To test our setup, let’s follow these steps: 
1. Copy the entire content of the main. c file from our previous project that used the structure- 
based access approach. 
Open the main.c file in the current project. 
Delete all the existing content in the main.. c file. 


Paste the copied content into the main.c file. 


Sr eee oh. 


In the main.. c file, delete all the code related to manually defined addresses and structures, 
as we will now use the register definitions provided in our header files. 
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6. Include the stm32£4xx.h header file in your project to access these definitions. 


7. Build the project in the IDE and run it on the development board. 


The following snippet shows the updated main . file: 


//1: Include the stm32f4 header file 
#include "stm32f4xx.h" 
//2: Bit mask for enabling GPIOA (bit 0) 


#define GPIOAEN (1U<<0) 
//3: Bit mask for GPIOA pin 5 
#define PIN5 (1U<<5) 
//4: Alias for PIN5 representing LED pin 
#define LED PIN PIN5 
int main (void) 
{ 
// 5: Enable clock access to GPIOA 
RCC- >AHB1ENR |= GPIOAEN; 
// 6: Set PA5 to output mode 
GPIOA->MODER |= (1U<<10); 
GPIOA->MODER &= ~(1U<<11); 
while (1) 
{ 
Hi We See PA5 (LED PIN) high 


GPIOA->0DR*= LED PIN; 
// 8: Simple delay 
for(int i=0;i<100000;i++) {} 


} 


You'll observe that this project builds without any errors and works in the same way as our previous 
one. This implementation accesses the microcontroller registers defined in chip _headers/CMSIS/ 
Device/ST/STM32F4xx/Include/stm32f411xe.h. 


Upon inspecting this file, you will find that it contains a typedef structure for each peripheral 
of our microcontroller, similar to the one we manually created a few sections ago. This means that 
moving forward, we don’t need to manually extract the base addresses and register offsets from the 
documentation. Instead, we can simply include the stm32£4xx. h header file in our project. This 
header file will in turn include the stm32£411xe .h file because we specified in the Symbols tab 
of the project properties window that we are using the STM32F41 1xE microcontroller. 


Setting up the required header files significantly simplifies configuring and using peripherals on our 
STM32 microcontroller. This approach allows us to leverage pre-defined register addresses and bit 
definitions, making our code more readable and maintainable while also reducing development time. 
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Summary 


In this chapter, we explored CMSIS, a critical framework for Cortex-M and some Cortex-A processors. 
This chapter gave us the foundational knowledge to enhance code portability and efficiency in our 
Arm Cortex-M projects. 


We began by learning how to define hardware registers using C structures, a fundamental technique 
for improving code readability and maintainability. This knowledge allowed us to understand how 
the CMSIS-compliant header files provided by microcontroller manufacturers give us access to the 
register definitions. 


Next, we explored CMSIS itself, discussing its components and how it facilitates efficient software 
development. We examined the key benefits of CMSIS, such as standardization, portability, and 
efficiency, and introduced its main components, including CMSIS-Core, CMSIS-Driver, CMSIS-DSP, 
CMSIS-NN, CMSIS-RTOS, CMSIS-Pack, and CMSIS-SVD. 


We then moved on to setting up the necessary CMSIS files from our silicon manufacturer. This process 
involved downloading the relevant packages, organizing the files, and integrating them into our project. 


Finally, we tested our setup by updating our previous project to use CMSIS files instead of our manually 
defined peripheral structures. This practical application showcased how CMSIS simplifies accessing 
microcontroller registers, making the code more readable, maintainable, and efficient. 


In the next chapter, we will learn about the GPIO peripheral. This chapter will provide a comprehensive 
understanding of how to configure and use GPIO for input/output applications in embedded systems. 
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The General-Purpose Input/ 
Output (GPIO) Peripheral 


In this chapter, we will explore the General-Purpose Input/Output (GPIO) peripheral, an essential 
component in microcontrollers. This peripheral is crucial for interfacing with microcontrollers, making 
it fundamental to embedded systems development. 


We will start by exploring the organization of GPIO ports and pins, covering both the general-purpose 
and alternate functions of these pins. Next, we will examine the key registers associated with the GPIO 
peripheral in STM32 microcontrollers. Finally, we will apply this knowledge to develop input and 
output drivers using the detailed register information we learn in this chapter. 


In this chapter, we will cover the following main topics: 
e Understanding the GPIO peripheral 
e The STM32 GPIO registers 


e Developing input and output drivers 


By the end of this chapter, you will be able to use the GPIO peripheral to interface effectively with 
microcontrollers, which will enable you to handle various input and output tasks with confidence. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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Understanding the GPIO peripheral 


Since we introduced the GPIO peripheral in Chapter 2, this section will reiterate the key points to 
remember regarding GPIOs. 


Microcontroller pins are grouped into ports. For instance, a microcontroller might have ports named 
GPIOA, GPIOB, and GPIOC. See Figure 2.10 in Chapter 2. Each port is composed of individual pins, 
which are referred to by their port name, followed by their pin number. The following are examples 
of this naming convention: 


e PAI refers to port A, pin 1 
e PD7 refers to port D, pin 7 


This naming convention helps in identifying and configuring specific pins for various functions. 


The STM32F411xC/E microcontroller series features six ports: PORTA, PORTB, PORTC, PORTD, 
PORTE, and PORTH. Each port is equipped with a comprehensive set of registers to manage 
configuration, data handling, and functionality. 


These ports offer a variety of features designed for versatility and performance. The features offered 
include the following: 


e I/O control: They allow us to manage up to 16 input/output pins per port. 


e Output states: Pins can be configured for push-pull or open-drain modes, with optional pull-up 
or pull-down resistors. 


¢ Output data: Output data is driven by the GPIOx_ODR register when the pin is configured 
as a general-purpose output. For alternate function configurations, the associated peripheral 
drives the output data. 


e Speed selection: The operating speed for each I/O pin can be set. 
e Input states: Pins can be configured as floating, pull-up, pull-down, or analog inputs. 


e Input data: Data can be read from the GPIOx_IDR register or an associated peripheral when 
configured for alternate function input. 


¢ Configuration locking: The GPIOx_LCKR register can be used to lock the I/O configuration, 
preventing accidental changes. 


e Alternate function selection: Up to 16 alternate functions per I/O pin can be configured, 
providing flexibility in pin usage. 


In the next section, we shall explore some of the GPIO registers of the STM32F411 microcontroller. 


The STM32 GPIO registers 


The STM32 GPIO registers 


In this section, we will explore the characteristics and functions of some of the common registers 
within the GPIO peripheral. 


Each GPIO port includes a set of 32-bit registers essential for configuration and control. The configuration 
registers comprise the following: 


e GPIOx_MODER (mode register) 
e GPIOx_OTYPER (output type register) 
e GPIOx_OSPEEDR (output speed register) 


e GPIOx_PUPDR (pull-up/pull-down register) 
The data registers include the following: 


e GPIOx_IDR (input data register) 


e GPIOx_ODR (output data register) 


GPIOx_BSRR (the bit-set/reset register) and GPIOx_LCKR (the locking register) are used to 
control pin states and access. Additionally, the alternate function selection registers, GPIOx_AFRH 
and GPIOx_AFRL, manage the alternate function assignments for the pins within the GPIO port. 


Let's start with the GPIO mode register. 


The GPIO mode register (GPIOx_MODER) 


The GPIO port mode register (GPIOx_MODER) is an important register for configuring the mode of 
each pin in the GPIO port. This register allows us to set each pin in different modes, such as input, 
output, alternate function, or analog. 


It is a 32-bit register divided into 16 pairs of bits. Each pair of bits corresponds to a specific pin in the 
GPIO port, allowing the individual configuration of each pin. See Figure 2.17 in Chapter 2. 


The possible configurations for these bits are the following: 


e 00: Input mode (reset state) 


In this mode, the pin is configured as an input, which can be used to read signals from external 
devices such as a push button. 


e 01: General-purpose output mode 


This mode configures the pin as an output, which can be used to drive external signals or 
components such as an LED. 
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e 10: Alternate function mode 


This mode sets the pin to an alternate function, allowing it to interface with various peripherals 
such as UART, SPI, or I2C. 


e 11: Analog mode 


This mode configures the pin for analog input, which is useful for analog-to-digital converter 
(ADC) operations. 


Let’s see a practical example. 
Consider configuring a pin (e.g., PA5) on port A: 
1. To set PA5 as a general-purpose output (01), we can follow these steps: 
Locate the bit pair corresponding to PAS (MODERS([1:0]); these are bit11 and bit10 
Write 0 to bit11 and 1 to bit10 
2. To set PA5 as an alternate function (10), we should write 1 to bit11 and 0 to bit10. 
3. To set PA5 as an analog input (11), we should write 1 to bit11 and 1 to bit10. 
This is all there is to know about the GPIOx_MODER register. 
Let’s move on to examine two other important registers: the output data register (GPIOx_ODR) and 


the input data register (GPIOx_IDR). 


The GPIO output data register (GPIOx_ODR) and the GPIO input 
data register (GPIOx_IDR) 


The GPIO output data register (GPIOx_ODR) and GPIO input data register (GPIOx_IDR) are essential 
for managing the data flow through the GPIO pins. These registers allow for reading the state of pins 
and setting the state of pins, enabling our microcontroller to interact with external devices effectively. 


GPIOx_ODR is a 32-bit register, but only the lower 16 bits are used to control the output state of the 
pins. Each bit in the register corresponds to a pin in the GPIO port. 


By writing to this register, we can set the logic level (high or low) of each pin configured as an 
output. Figure 7.1 shows the structure of the GPIO output data register (ODR) extracted from the 
reference manual. 
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31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 
Reserved | 
15 10 9 8 7 6 5 4 3 2 1 0 


ODR15 


14 13 12 11 
rw rw rw rw rw rw rw rw rw rw rw rw rw rw rw 


Figure 7.1: GPIO ODR 


Take the following examples: 


e Writing 1 to bit5 (ODR5) sets PAS to a high state 


e Writing 0 to bit5 (ODR5) sets PA5 to a low state 


How about the GPIO input data register? 


The GPIO input data register (GPIOx_IDR) is used to read the current state of the GPIO pins 
configured as inputs. By reading from this register, we can determine whether each input is at a high 
or low logic level. 


It is a 32-bit register, but similar to the ODR, only the lower 16 bits are used to read the state of the 
pins. Each bit in the register corresponds to a pin in the GPIO port. 


A bit value of 1 indicates that the corresponding pin is at a high logic level, while a bit value of 0 
indicates that it is at a low logic level. 


Figure 7.2 shows the structure of the GPIO input data register: 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 
Reserved | 

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
IDR15 | IDR14 | 1DR13 | IDR12 | IDR11 | IDR10 | IDR9 | IDR8 | IDR7 | IDR6 | IDR5 | IDR4 | IDR3 | IDR2 | IDR1 | IDRO | 
r | r F r r | r | F | r F | d | r | r r | r | r r | 


Figure 7.2: GPIO input data register 


Take the following examples: 


e Ifbit5(IDR5) reads 1, PAS is at a high state 


e Ifbit5(IDR5) reads 0, PAS is at a low state 


Another commonly used register is the GPIO bit set/reset register (GPIOx_BSRR). Let’s examine 
this register in the next section. 
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The GPIO bit-set/reset register (GPIOx_BSRR) 


The GPIO bit-set/reset register (GPIOx_BSRR) is a crucial register for controlling the state of GPIO 
pins. It provides atomic bitwise operations to set or reset individual bits, which ensures that no 
interrupts can disrupt the operation, maintaining data integrity during modifications. 


GPIOx_BSRR is a 32-bit register divided into two 16-bit sections: 


e Bits 15:0 (BSy): These bits are used to set the corresponding GPIO pin 
e Bits 31:16 (BRy): These bits are used to reset the corresponding GPIO pin 


Figure 7.3 shows the structure of the GPIO bit-set/reset register (BSRR). 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 
BR15 | BR14 | BR13 | BR12 | BR11 | BR10 | BR9 | BR8 | BR7 | BR6 | BR5 | BR4 | BR3 | BR2 | BR1 | BRO 
w Ww w w w w w w w w w Ww w w w w 
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
Bs15 | Bs14 | Bs13 | Bs12 | Bs11 | Bs10 | Bso | Bse | Bs7 | sse | sss | Bs4 | Bs3 | Bs2 | Bs1 | Bso 
Ww | Ww | w w | w | w | w w | w | w | w | Ww | w | w w | w 


Figure 7.3: BSRR 


Let’s analyze the bits in the register: 


e Bits 31:16 (BRy): Writing 1 to any of the upper 16 bits resets the corresponding pin to a low state 
e Bits 15:0 (BSy): Writing 1 to any of the lower 16 bits sets the corresponding pin to a high state 


For example, let’s see how to set PA5 as high and low using the BSRR: 


e To set PAS as high: We write 1 to bit5(BS5) in the GPIOA_BSRR register 
e To set PAS as low: We write 1 to bit21(BR21) in the GPIOA_BSRR register 
The GPIO BSRR (GPIOx_BSRR) provides a robust mechanism for controlling the state of GPIO pins. 


By understanding its structure and functionality, we can perform efficient and atomic operations to 
set or reset individual pins. 


The STM32 GPIO registers 


Another pair of commonly used GPIO registers are the GPIO alternate function high and alternate 
function low registers. Let’s explore them in the next section. 


The GPIO alternate function registers (GPIOx_AFRL and GPIOx_ 
AFRH) 


The GPIO alternate function registers (AFRs) are important for configuring the alternate functions 
of the GPIO pins in STM32 microcontrollers. These registers allow each pin to be assigned a specific 
peripheral function, enhancing the versatility and functionality of the microcontroller. 


Each GPIO port has two AFRs: 


e GPIOx_AFRL: Alternate function low register (AFRL), for pins 0 to 7 
e GPIOx_AFRH: Alternate function high register (AFRH), for pins 8 to 15 


These registers enable the selection of alternate functions for each pin, facilitating the use of the pins 
for various peripheral interfaces such as UART, I2C, and SPI. 


GPIOx_AFRL is a 32-bit register, divided into eight 4-bit fields. Each 4-bit field corresponds to one 
pin in the range of pins 0 to 7 in the GPIO port. 


GPIOx_AFRH is also a 32-bit register, divided into eight 4-bit fields. Here, each 4-bit field corresponds 
to one pin in the range of pins 8 to 15 in the GPIO port. 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 


AFRL7[3:0] AFRLE6[3:0] AFRL5[3:0] 
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
AFRL3{3:0] AFRL2[3:0] AFRL1[3:0] AFRLO[3:0] 


Figure 7.4: Alternate function low register 


Figure 7.4 illustrates the AFRL. To understand the various alternate functions each pin can assume 
based on the values of the corresponding 4-bit fields, we will refer to Figure 7.5 as our guide: 
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AFO (system) 

AF1 (TIM1/TIM2) 

AF2 (TIM3..5) 

AF3 (TIM9..11) 

AF4 (12C1..3) 

AF5 (SPI1..4) 

AF6 (SPI3..5) Pin x (x = 0..7) 
AF7 (USARTL..2) 

AF8 (USART6) i 
AF9 (12C2..3) 

AF10 (OTG_FS) 

AF11 

AF12 (SDIO) 

AF13 

AF14 

AF15 (EVENTOUT) 


| 


AFRL(31:0] 


For pins 8 to 15, the GPIOx_AFRH[31:0] register selects the dedicated alternate function 


AFO (system) 

AF1 (TIM1/TIM2) 

AF2 (TIM3..5) 

AF3 (TIM9..11) 

AF4 (I2C1..3) 

AF5 (SPI1..4) 

AF6 (SPI3..5) Pin x (x = 8..15) 
AF7 (USART1..2) 

AF8 (USART6) A 
AF9 (12C2..3) 

AF10 (OTG_FS) 

AF11 

AF12 (SDIO) 

AF13 

AF14 

AF15 (EVENTOUT) 


| 


AFRH[31:0] 


Figure 7.5: Alternate function selection 


The STM32 GPIO registers 


Figure 7.5 shows a diagram with two multiplexers. The first multiplexer represents the selector for 
the AFRL, while the second represents the selector for the AFRH. This diagram is sourced from page 
150 of the reference manual. It effectively demonstrates how to configure GPIO pins for alternate 


functions on STM32F411 microcontrollers using these registers. 


Let’s break down what we see. 


For both GPIOx_AFRL and GPIOx_AFRH, the diagram provides a list of the possible alternate 
functions that can be selected for each pin. The alternate functions are designated by AFO through 


AF15, each associated with specific peripheral functions, as follows: 


Alternate function description Binary value 
AFO: System functions 0000 
AF1: TIM1/TIM2 0001 
AF2: TIM3/TIM4/TIM5 0010 
AF3: TIM9/TIM10/TIM11 0011 
AF4: 12C1/12C2/12C3 0100 
AF5: SPI1/SP12/SPI3/SP14 0101 
AF6: SPI3/SP14/SPI5 0110 
AF7: USART1/USART2 0111 
AF8: USART6 1000 
AF9: 12C2/I2C3 1001 
AF10: OTG_FS 1010 
AF11: Reserved 1011 
AF12: SDIO 1100 
AF13: Reserved 1101 
AF14: Reserved 1110 
AF15: EVENTOUT 1111 


Table 7.1: Alternate function selection 


For example, to configure pin 5 (PAS) to be used by TIM9 (AF3), we write the value 0011 to bits 
[23:20] of the GPIOA_AFRL register. We use GPIOA_AFRL because PAS falls within the range of 
pins 0 to 7, which are managed by this register. To configure pin 10 (PA10) to be used by the USART1 
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peripheral (AF7), we write the value 0111 to bits [7:4] of the GPIOA_AFRH register. This is because 
PA10 falls within the range of pins 8 to 15, which are managed by the GPIOA_AFRH register. 


In this section, we explored the characteristics and functions of several essential registers within 
the STM32 GPIO peripheral. We began with the GPIO mode register (GPIOx_MODER), which 
configures the mode of each GPIO pin, allowing settings for the input, output, alternate function, 
or analog mode. We then examined the GPIO output data register (GPIOx_ODR) and input data 
register (GPIOx_IDR), which manage the data flow through the GPIO pins by setting and reading 
pin states. Next, we looked at the GPIO BSRR (GPIOx_BSRR), which provides atomic operations for 
setting and resetting individual pin states. Finally, we covered the GPIO alternate function registers 
(GPIOx_AFRL and GPIOx_AFRH), which assign specific peripheral functions to each pin, enhancing 
the microcontroller’s versatility. 


In the next section, we will develop GPIO drivers using the knowledge gained in this section. Specifically, 
we will focus on creating input and output drivers. We will explore the practical usage of the alternate 
function registers in the chapters dedicated to communication peripherals. 


Developing input and output drivers 


In this section, we will apply the knowledge gained about the GPIO peripheral to develop practical 
input and output drivers for STM32 microcontrollers. Since we are already familiar with developing 
the output driver to toggle an LED using the ODR, this section will focus on developing the output 
driver using the BSRR. 


The GPIO output driver using the BSRR 
Let’s start by making a copy of our last project in our IDE: 


1. Right-click on the last project and select Copy. 
2. Right-click in the Project Explorer pane and select Paste. 


3. Rename the copied project to GpioInput -Output. 
Next, we will modularize our code by creating dedicated files for the GPIO driver code: 


1. Right-click on the Src folder in the project and select New | File. 
2. Inthe File Name field, type gpio.c. 


Then, we will create the corresponding header file: 


1. Right-click on the Inc folder in the project and select New | File. 
2. Inthe File Name field, type gpio.h. 
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Our next task is to implement the driver code in the gpio.c file and declare the public functions 
in the gpio.h file. 


Populate your gpio . c with the following code: 


#include "gpio.h" 


#define GPIOAEN (1U<<0) 
#define LED _BS5 (1U<<5) /*Bit Set Pin 5*/ 
#define LED BR5 (1U<<21) /*Bit Reset Pin 5*/ 


void led_init (void) 

{ 
/*Enable clock access to GPIOA*/ 
RCC- >AHB1ENR l= GPIOAEN; 
/*Set PA5 mode to output mode*/ 
GPIOA- >MODER |S Gueto) g 
GPIOA->MODER &=~(1U<<11); 


} 


void led_on(void) 


/*Set PA5 high*/ 
GPIOA->BSRR |=LED_BS5; 


void led_off (void) 


/*Set PAS high*/ 
GPIOA->BSRR |=LED_BR5; 


} 


Let’s break down the unique elements of the gpio . c file, focusing on the usage of the BSRR. We 
start with the header file inclusion: 


#include "gpio.h" 


This line includes the gpio.h header file, which in turn includes stm32£xx . h to provide access 
to the register definitions. 


Next, we define all the macros we need: 


#define GPIOAEN (1U<<0) 
#define LED _BS5 (1U<<5) /* Bit Set Pin 5 */ 
#define LED _BR5 (1U<<21) /* Bit Reset Pin 5 */ 
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Let’s break these macros down: 


e GPIOAEN: This macro is used to enable the clock for GPIOA by writing to the AHB1ENR register. 


e LED _BS5: This macro represents the bit-set operation for pin PA5. Writing this value to the 
BSRR sets PAS to high, turning the LED on. 


e LED _BR5: This macro represents the bit-reset operation for pin PA5. Writing this value to the 
BSRR resets PAS to low, turning the LED off. 


Next, we implement the function for turning on the LED: 


void led_on(void) 


/* Set PAS high */ 
GPIOA->BSRR |= LED BS5; 


} 


The GPIOA->BSRR |= LED_BS5 line uses the BSRR to set PAS to high. Writing 1 to bit 5 of the 
BSRR sets the corresponding pin (PA5) to high, turning the LED on. 


And then we implement the function to turn the LED off: 


void led_off (void) 


{ 


/* Set PAS low */ 
GPIOA->BSRR |= LED_BR5; 


} 


Similarly, GPIOA->BSRR |= LED_BRS5 uses the BSRR to reset PAS to low. Writing 1 to bit 21 of 
the BSRR resets the corresponding pin (PAS) to low, turning the LED off. 


Here is the content for the gpio .h file: 


#ifndef GPIO H_ 
#define GPIO H_ 


#include "stm32f4xx.h" 
void led_init (void) ; 
void led_on(void) ; 


void led_off (void) ; 


#endif /* GPIO H_ */ 


Developing input and output drivers 


Let’s break it down, starting with the header guard: 


#ifndef GPIO H_ 
#define GPIO H_ 


#endif /* GPIO H_ */ 


The header guards prevent multiple inclusions of the same header file, which can lead to errors and 
redundant declarations. The #ifndef GPIO_H_ directive checks whether GPIO_H_ has been 
defined yet. If it hasn't, it proceeds to define GPIO_H_ and includes the rest of the file. The #endif 
directive at the end closes the conditional directive that began with #ifndef. 


Next, we have the include directive: 


#include "stm32f4xx.h" 


This directive includes the stm32£4xx .h header file, which in turn includes the stm32£411xe.h 
header file, which provides definitions and declarations for all the registers in our microcontroller. 


And then we have the function declarations: 


void led_init (void) ; 
void led_on(void) ; 
void led_off (void) ; 


These declarations allow us to access the functions defined in the gpio . c file from other files, such 
asmain.c. 


Now that our GPIO output driver for PA5 is complete, let’s test it by updating the main . file to call 
the functions defined in the gpio.c file. Here is the updated main. c code: 


#include "gpio.h" 
int main (void) 
{ 
/*Initialize LED*/ 
Lecl iaten) p 
while (1) 
{ 
econ: @)s; 
for(int i = 0; i < 100000; i++) {} 
led_off(); 
for(int i = 0; i < 100000; i++) {} 
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The code begins by including the gpio.h header file to access the GPIO functions defined in 
gpio.c. Within the main function, it first calls led_init () to initialize PA5 as an output pin. 
Then, it enters an infinite loop where it alternately turns the LED on and off by calling Led_on () 
and led_of f (), respectively. We use simple delay loops between these calls to keep the LED on 
and off for visible durations, effectively making the LED blink continuously. 


Proceed to build the project and run it on the development board. You should see the green LED blinking. 
In the next section, we shall develop the GPIO input driver using PC13. We are using PC13 because 
the blue push button of the development board is connected to this pin. 


The GPIO input driver 


Let’s start by analyzing the initialization function. Add this function to the gpio.c file: 


#define GPIOAEN (1U<<0) 
#define GPIOCEN (1U<<2) 
#define BIN PIN (1U<<13) 


void button_init (void) 


/*Enable clock access to PORTC*/ 
RCC- >AHB1ENR | =GPIOCEN; 
/*Set PC13 as an input pin*/ 
GPIOC->MODER &=~(1U<<26) ; 
GPIOC->MODER &=~ (1U<<27) ; 

} 


This function enables clock access to GPIO port C and configures pin PC13 as an input pin: 
e GPIOC->MODER: This is the GPIO port mode register for port C. Each pair of bits in this 
register corresponds to the mode configuration of a specific pin. 


e Clearing bits 26 and 27: The bits corresponding to pin 13 in the GPIOC->MODER register 
are bits 26 and 27. 


The bitwise AND operator combined with the bitwise NOT operator (&=~) clears these bits, 
setting them to 00. As we learned earlier, configuring these bits to 00 sets PC13 as an input pin. 


Next, add the function for reading the state of the pin: 


bool get_btn_state (void) 


{ 
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/*Note : BTN is active low*/ 


/*Check if button is pressed*/ 
if (GPIOC->IDR & BTN_PIN) 


{ 

return false; 
} 
else 
{ 

return true; 
} 


} 


This function reads the state of the button. The button is internally connected as an active-low input. 
This means the pin reads at a low logic level (0) when the button is pressed and a high logic level (1) 
when it is not pressed. 


e GPIOC->IDR: This is the input data register for GPIO port C. It holds the current state of all 
the pins in the port. 


e Bitwise AND Operator (&): This checks whether the bit corresponding to BTN_PIN in the 
IDR register is set to high. 


e Return values: 


* false: If the bit is set, the button is not pressed 


* true: If the bit is not set, the button is pressed 


To be able to access these new functions from other files, such as main. c, we need to add their 
prototypes to the gpio.h file. 


Add the following lines to gpio. h: 


#include <stdbool.h> 
woulel leybiwicreral abalabic; (G¥e)6))) 5 
bool get_btn_state(void) ; 


Now, let’s test the new functions by updating the main . file to call them. 
The following is the updated main. c code: 


#include "gpio.h" 


bool btn_state; 
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int main(void) 


{ 
/*Initialize LED*/ 
Ved aime); 


/*Initialize Pushbutton*/ 
Düuttconminit Or 


while (1) 


{ 


/*Get Pushbutton State*/ 
owa erete = cet dwn ELeCal y 


if (btn_state) 
} 
else 


{ 
} 


Lodmons®E, 


led_off(); 


} 
Let’s break it down: 


e bool btn_state: This variable holds the state of the push button 
e led_init (): Configures PA5 as an output pin 
e button_init (): Configures PC13 as an input pin 
e while(1) {..}: 
* Continuously reads the state of the push button using get_btn_state() 


" Turns the LED on if the button is pressed (btn_ state is true) 


" Turns the LED off if the button is not pressed (btn_state is false) 


Build the project and run it on the development board. You should see the green LED light up only 
when you press the push button. 


Summary 


Summary 


In this chapter, we explored the GPIO peripheral, a critical peripheral in microcontrollers that is essential 
for interfacing with various external components. We began by understanding the organization of 
GPIO ports and pins, covering both general-purpose and alternate functions. 


The STM32F411 microcontroller series features several ports, each equipped with registers to manage 
configuration, data handling, and functionality. 


We introduced the registers associated with the GPIO peripheral, including configuration registers 
such as GPIOx_MODER, GPIOx_OTYPER, GPIOx_OSPEEDR, and GPIOx_PUPDR, as well as 
data registers such as GPIOx_IDR and GPIOx_ODR. We also covered GPIOx_BSRR (bit-set/reset 
register) for atomic pin state control and GPIOx_LCKR (locking register) for preventing accidental 
configuration changes. Additionally, we explored the GPIO alternate function registers (GPIOx_AFRL 
and GPIOx_AFRH), which enable versatile pin usage by assigning specific peripheral functions. 


In practical terms, we developed both output and input drivers. We first created an output driver 
using the GPIOx_BSRR register to control the LED connected to pin PAS. This involved setting up 
the necessary macros, implementing initialization and control functions, and testing the driver by 
making the LED blink. We then developed an input driver for reading the state of the push button 
connected to pin PC13. This included configuring PC13 as an input pin, implementing a function to 
read the button state, and testing the driver by making the LED respond to button presses. 


In the next chapter, we shall explore another important peripheral: the system tick (SysTick) timer. 
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In this chapter, we will learn about the System Tick (SysTick) timer, an important core peripheral in 
all Arm Cortex microcontrollers. We will begin by introducing the SysTick timer and discussing its 
most common use cases. Following this, we will explore the SysTick timer registers in detail. Finally, 
we will develop a driver for the SysTick timer. 


In this chapter, were going to cover the following main topics: 
e Introduction to the SysTick timer 
e Developing a driver for the SysTick timer 


By the end of this chapter, you will have a good understanding of the SysTick timer and be able to 
effectively implement and utilize it in your Arm Cortex-M projects. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at 


https: //github.com/PacktPublishing/Bare-Metal -Embedded-C- Programming. 


Introduction to the SysTick timer 


The System Tick timer, commonly known as SysTick, is a fundamental component of all Arm 
Cortex microcontrollers. Regardless of the processor core—whether it’s Cortex-M0, Cortex-M1, or 
Cortex-M7—and the silicon manufacturer—be it STMicroelectronics, Texas Instruments, or any 
other—every Arm Cortex microcontroller includes a SysTick timer. In this section, we will learn 
about this essential peripheral and explore its registers in detail. 
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Overview of the SysTick timer 


The SysTick timer is a 24-bit down counter integral to all Arm Cortex-M processors. It is designed 
to offer a configurable time base that can be used for various purposes, such as task scheduling, 
system monitoring, and time tracking. This timer provides us with a simple and efficient means of 
generating periodic interrupts and serves as a cornerstone for implementing system timing functions, 
including operating system (OS) tick generation for real-time operating systems (RTOSs). Using 
SysTick makes our code more portable since it is part of the core and not a vendor-specific peripheral. 


The key features of the SysTick timer include the following: 


24-bit Reloadable Counter: The counter decrements from a specified value to zero, then reloads 
automatically to provide a continuous timing operation 


Core Integration: Being part of the core, it requires minimal configuration and offers low-latency 
interrupt handling 


Configurable Clock Source: SysTick can operate either from the core clock or an external 
reference clock, providing flexibility in timing accuracy and power consumption 


Interrupt Generation: When the counter reaches zero, it can trigger an interrupt 


The SysTick timer typically serves three primary use cases: 


e 


OS Tick Generation: In an RTOS environment, SysTick is commonly used to generate the 
system tick interrupt, which drives the OS scheduler 


Periodic Task Execution: It can be used to trigger regular tasks, such as sensor sampling or 
communication checks 


Time Delay Functions: SysTick can provide precise delays for various timing functions within 
the firmware 


Now let’s explore the registers in the SysTick timer. 


SysTick timer registers 


The SysTick timer consists of four primary registers: 


e 


e 


SysTick Control and Status Register (SYST_CSR) 
SysTick Reload Value Register (SYST_RVR) 
SysTick Current Value Register (SYST_CVR) 


SysTick Calibration Value Register (SYST_CALIB) 


Let’s analyze them one by one, starting with the Control and Status Register. 


Introduction to the SysTick timer 


The SysTick Control and Status Register (SYST_CSR) 


The SYST_CSR register controls the SysTick timer’s operation and provides status information. It 
has the following bits: 
e ENABLE (Bit 0): Enables or disables the SysTick counter 
e TICKINT (Bit 1): Enables or disables the SysTick interrupt 
e CLKSOURCE (Bit 2): Selects the clock source (0 = external reference clock, 1 = processor clock) 
e COUNTEFELAG (Bit 16): Indicates whether the counter has reached zero since the last read (1 


= yes, 0 = no) 


This is the structure of the SysTick Control and Status Register: 


31 17 16 15 3210 
COUNTFLAG | CLKSOURCE —! 
TICKINT 
ENABLE 


Figure 8.1: The SysTick Control and Status Register 


The next register is the SysTick Reload Value Register (SYST_RVR). 


The SysTick Reload Value Register (SYST_RVR) 


This register specifies the start value to load into the SysTick Current Value Register. It is crucial 
for setting the timer’s period and understanding its bit assignments and calculations is essential for 
effective SysTick configuration. 


It has the following fields: 


e Bits [31:24] Reserved: These bits are reserved 


e Bits [23:0] RELOAD: This field specifies the value that the SysTick timer will load into the 
SYST_CVR register when the counter is enabled and when it reaches zero 
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This is the structure of the SysTick Reload Value Register: 


31 24 23 0 


Figure 8.2: The SysTick Reload Value Register 


Since SysTick is a 24-bit timer, the RELOAD value can be any value in the range 0x00000001 
to OxOOFFFFFF. 


To calculate the RELOAD value based on the desired timer period, we determine the number of clock 
cycles for the desired period, and then subtract 1 from this number to get the RELOAD value. 


For example, if the core clock frequency is 16 MHz and we want the SysTick timer to trigger every 
1 ms, the RELOAD value would be calculated as follows: 


1. Calculate the number of clock cycles in 1 ms: 
Clock cycles = 16,000,000 cycles/second * 0.001 second = 16,000 cycles 
Note: 1ms = 0.001 second 

2. Subtract 1 from the calculated number of clock cycles: 

RELOAD = 16,000 - 1 = 15,999 since counting from 0 to 15,999 will give us 16000 ticks. 
Meaning, to configure the SysTick timer for a 1 ms period with a 16 MHz clock, we would set the 
RELOAD value to 15,999. The next register is the SysTick Current Value Register (SYST_CVR). 
The SysTick Current Value Register 
The SysTick Current Value Register (SYST_CVR) holds the current value of the SysTick counter. 
We can use this register to monitor the countdown process and to reset the counter when necessary. 
It has the following fields: 

e Bits [31:24] Reserved: These bits are reserved and should not be modified. They must be 
written as zero. 


e Bits [23:0] CURRENT: This field contains the current value of the SysTick counter. Reading 
this field returns the current counter value. Writing any value to this field clears it to zero and 
also clears the COUNTFLAG bit in the SysTick Control and Status Register (SYST_CSR). This 
is the SysTick Current Value Register: 
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31 24 23 0 


Figure 8.3: The SysTick Current Value Register 


The SysTick Calibration Value Register 


The final register of the SysTick timer is the SysTick Calibration Value Register (SYST_CALIB). This 
register provides us with the calibration properties of the SysTick timer. 


The names of these registers are slightly different in the STM32 header files. Table 8.1 provides a clear 
correspondence between the register names used in the Arm Generic User Guide documentation and 
those in the STM32-specific header files. This correspondence will help us understand and reference 
them correctly in our code. 


Function Arm Generic User Guide | STM32 Header File 
Control and Status | SYST_CSR SysTick->CTRL 
Reload Value SYST_RVR SysTick->LOAD 
Current Value SYST_CVR SysTick->VAL 
Calibration Value SYST_CALIB SysTick->CALIB 


Table 8.1: Correspondence of SysTick register names 


In the next section, we will use the information we have learned to develop a driver for the SysTick timer. 


Developing a driver for the SysTick timer 


In this section, we will develop a driver for the SysTick timer to generate precise delays. 


Let's start by making a copy of our last project in our IDE, following the steps we learned in Chapter 7. 
Rename the copied project to SysTick. Next, create a new file named systick.c inthe Sre 
folder and another file named syst ick.h in the Inc folder, just like we did for the GPIO drivers 


in the previous lesson. 
Populate your systick.c file with the following code: 


#include "systick.h" 


#define CTRL ENABLE (1U<<0) 
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#define CTRL CLCKSRC (1U<<2) 
#define CTRL COUNTFLAG (1U<<16) 


/*By default, the frequency of the MCU is 16Mhz*/ 
#define ONE MSEC LOAD 16000 


void systick msec delay(uint32 t delay) 


{ 


/*Load number of clock cycles per millisecond*/ 
SysTick->LOAD = ONE MSEC LOAD - 1; 


/*Clear systick current value register*/ 
SysTick->VAL = 0; 


/*Select internal clock source*/ 
SysTick->CTRL = CTRL_CLCKSRC; 


/*Enable systick* / 
SysTick->CTRL |=CTRL ENABLE; 


for(int i = 0; i < delay; i++) 


{ 


while((SysTick->CTRL & CTRL _COUNTFLAG) == 0) {} 


/*Disable systick*/ 
SysTick->CTRL = 0; 


} 
Let’s break it down. 
We start with the header file inclusion: 


#include "systick.h" 


This line includes the header file, syst ick .h, which in turn includes stm32£xx - h to provide 
access to the register definitions. 
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Next, we define all the macros we need: 


e #define CTRL ENABLE (1U << 0): Macro to enable the SysTick timer. 


e #define CTRL CLKSRC (1U << 2): Macro to select the internal clock source for the 
SysTick timer. 


e #define CTRL _COUNTFLAG (1U << 16): Macro to check the COUNTFLAG bit, which 
indicates when the timer has counted to zero. 


e #define ONE _MSEC_LOAD 16000: Macro to define the number of clock cycles in 1 
millisecond. This assumes the microcontroller’s clock frequency is 16 MHz. This is the default 
configuration of the NUCLEO-F411 development board. 


Next, we move on to the function implementation. 
First, we have the following: 
SysTick->LOAD = ONE MSEC LOAD - 1; 
This line loads the SysTick timer with the number of clock cycles for 1 millisecond. 
Then, we clear the Current Value register with the following to reset the timer: 
SysTick->VAL = 0; 
Next, we select the internal clock source: 
SysTick->CTRL = CTRL CLKSRC; 
To enable the SysTick timer, we use the following: 
SysTick->CTRL |= CTRL ENABLE; 


Now, we enter the loop that handles the delay: 


rog (slate al = Oe al < celep es) 
{ 

while ((SysTick->CTRL & CTRL _COUNTFLAG) == 0) {} 
} 


This loop runs for the specified delay duration. Inside each iteration, it waits for the COUNTFLAG bit 
to be set, which indicates the timer has counted down to zero. 


Finally, we disable the SysTick timer: 
SyS Tiek CERT OF 


And that’s it! With these steps, we've successfully implemented a delay function using the SysTick timer. 
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Our next task is to populate the syst ick. h file. 
Here is the code: 


#ifmdef SYSTICK_H_ 
#define SYSTICK_H_ 


#include <stdint.h> 
#include "stm32f4xx.h" 


void systick_msec_delay(uint32_ t delay); 


#endif 


Over here, the #include <stdint .h> directive is needed to ensure that we have access to 
standard integer type definitions provided by the C standard library. These definitions include fixed- 
width integer types such as uint32_t, int32_t,uint16_t, and so on, which are essential for 
writing portable and clear code, especially in embedded systems programming. 


With the driver files complete, we are now ready to test inside main.c. 


First, let’s enhance our gpio . c file by adding a new function that toggles the LED. This will simplify 
our code by allowing us to toggle the LED with a single function call instead of calling led_on () 
and led_off () separately. 


Add the following function to your gpio.c file: 


#define LED PIN (1U<<5) 
void led_ toggle (void) 


/*Toggle PA5*/ 
GPIOA->0DR ^ =LED_ PIN; 


} 


This function toggles the state of the LED connected to pin PA5 by using the bitwise XOR operation 
on the Output Data Register (ODR). 


Next, declare this function in the gpio . h file by adding the following line: 
void led_toggle (void) 
Finally, update your main . c file as shown here to call the SysTick delay and LED toggle functions: 


#include "gpio.h" 
#include "systick.h" 


int main(void) 


Summary 


/*Initialize LED*/ 
Lecl ote) y 


while (1){ 
/*Delay for 500ms*/ 
systick msec_delay(500); 
/* Toggle the LED */ 
iiexel ieee IENE 


} 


In this example, we are toggling the LED at a 500ms interval. Build the project and run it on your 
development board. You should see the green LED blinking. To experiment further, you can modify 
the delay value and observe how the blinking rate of the LED changes. 


Summary 


In this chapter, we explored the SysTick timer, a core peripheral of all Arm Cortex microcontrollers. We 
began with an introduction to the SysTick timer, discussing its significance and common applications, 
such as generating OS ticks in real-time operating systems, executing periodic tasks, and providing 
precise time delays. 


We then examined the SysTick timer’s registers in detail. These included the Control and Status 
Register (SYST_CSR), which manages the timer’s operation and status; the Reload Value Register 
(SYST_RVR), which sets the timer’s countdown period; the Current Value Register (SYST_CVR), 
which holds the current value of the countdown; and the Calibration Value Register (SYST_CALIB), 
which provides essential calibration properties for accurate timing. We also provided a comparison 
between the register names used in the Arm Generic User Guide and those in the STM32 header files 
to ensure clear correspondence for accurate coding. 


The chapter concluded with the development of a SysTick timer driver. We walked through the creation 
and implementation of the systick_msec_delay function, which introduces millisecond delays 
using the SysTick timer. To test the driver, we integrated it with GPIO functions to toggle our green 
LED, demonstrating how to achieve precise timing and control in embedded systems. 


In the next chapter, we shall learn about another timer peripheral. Unlike the SysTick timer, the 
configuration of this timer peripheral is specific to STM32 microcontrollers. 
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In this chapter, we will delve into the general-purpose timers (TIM) found in the STM32F411 
microcontroller. Unlike the SysTick timer, these timer peripherals are unique to STM32 microcontrollers 
and offer a range of capabilities essential for various applications. 


We will begin by discussing the common uses of timers, providing a foundation for understanding 
their importance in embedded systems. Following this, we will explore the specific characteristics 
of the timers on STM32 microcontrollers. This includes understanding the timer clock source, the 
mechanics of prescaling, and a detailed look at the commonly used timer registers. 


In the latter part of the chapter, we will put theory into practice by developing a timer driver and 
applying the knowledge we've gained to create functional and efficient code. 


In this chapter, were going to cover the following main topics: 


e Introduction to timers and their uses 
e Common use cases of timers 
e STM32 timers 


e Developing the timer driver 


By the end of this chapter, you will have a good understanding of STM32 timers and how to develop 
drivers for them, enhancing your ability to utilize these peripherals effectively in your projects. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


Introduction to timers and their uses 


Timers are crucial components in embedded systems, serving essential functions in a wide array of 
applications. In this section, we will explore the concept of timers, their types, and their various applications. 
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The question is, What are timers? 


Timers are hardware peripherals found in microcontrollers that count clock pulses. These pulses can 
be used to measure time intervals, generate precise delays, or trigger events at specific intervals. 
Timers can operate in several modes, including counting up, counting down, and generating Pulse 
Width Modulation (PWM) signals. 


The choice between using the SysTick timer and general-purpose timers on STM32 microcontrollers 
comes down to the complexity and precision of your timing needs. While the SysTick timer is great 
for simple system timing tasks such as generating periodic interrupts or creating delays, it’s limited 
in flexibility and features, as it’s often dedicated to real-time operating system (RTOS) ticks. On 
the other hand, general-purpose timers, as we have just discussed, offer far more versatility, with the 
ability to handle complex tasks such as PWM generation, input capture, and output compare. They 
also support multiple channels for concurrent timing events and provide advanced functionalities 
such as precise frequency measurement and low-power operation. 


Let’s explore some typical use cases to illustrate the importance of timers. 


Common use cases of timers 


Let’s start with timers for time interval measurement. 


Time interval measurement 


One common use of timers is in ultrasonic sensors for distance measurement, widely used in robotics, 
automotive parking systems, and obstacle detection. 


Timers play a vital role in the operation of these sensors. Here’s how they work: 
1. The microcontroller sends a trigger signal to the ultrasonic sensor, causing it to emit an 
ultrasonic pulse. 
2. The sensor waits for the echo of the pulse to return after bouncing off an object. 
3. A timer starts counting when the pulse is emitted and stops when the echo is received. 
4. The time interval measured by the timer is used to calculate the distance to the object based 


on the speed of sound. 


By accurately measuring the time interval between sending the pulse and receiving the echo, the 
system can determine the distance to objects, enabling precise navigation and obstacle avoidance. 


Timers are also essential for generating precise delays, such as in the refresh mechanism of light- 
emitting diode (LED) matrix displays. 
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Delay generation 


In embedded systems, LED matrix displays are used for various applications, including digital signage, 
scoreboards, and simple graphical displays. To display images or text, the microcontroller needs to 
refresh the display at a precise rate to ensure smooth visuals. 


Over here, the following happens: 


e The microcontroller drives an LED matrix display and needs to refresh the display rows sequentially. 


e A timer generates precise delays to control the time each row is activated before moving to 
the next row. 


For example, the timer might be set to generate a delay of 1 millisecond between switching rows. 


This ensures that each row is displayed for a consistent period, maintaining a stable and 
flicker-free image. 


Now, let’s see an example of how timers can be used for triggering events in embedded systems. 


Event trigger 


Periodic data sampling from sensors is important for applications such as environmental monitoring 
and industrial process control (IPC). A timer can be used to trigger the analog-to-digital converter 
(ADC) at specific intervals to ensure consistent data sampling. 


Over here, the following happens: 


e The microcontroller is connected to an environmental sensor (for example, a temperature or 
humidity sensor) that outputs an analog signal 


e A general-purpose timer is configured to trigger the ADC conversion periodically (for example, 
every 100 milliseconds) 


e The ADC is set up to use the timer trigger option, automatically starting a conversion at each 
timer event 


e The main loop of the microcontroller regularly checks for completed ADC conversions and 
processes the data 


In the next section, we will focus on the features and characteristics of STM32 timers. 


STM32 timers 


The timers in STM32 microcontrollers are classified into two main categories: general-purpose timers 
and advanced timers. 
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Introduction to general-purpose timers and advanced timers 


General-purpose timers are highly versatile and can be used for a variety of applications, whereas 


advanced timers offer more sophisticated features than general-purpose timers, making them suitable 
for high-precision and complex timing tasks. 


In the STM32F411 microcontroller, timers TIM2, TIM3, TIM4, TIM5, TIM9, TIM10, and TIM11 
are general-purpose timers, whereas TIM] is an advanced timer. 


Key features of the general-purpose timers include the following: 


Counter size: They feature 16-bit counters for TIM3 and TIM4 and 32-bit counters for TIM2 
and TIM5, capable of operating in up, down, or up/down auto-reload modes 


Prescaler: They are equipped with a 16-bit programmable prescaler, which allows us to divide 
the counter-clock frequency by any factor ranging from 1 to 65,536 


Channels: They offer up to four independent channels 


Synchronization: The timers can synchronize with external signals and interconnect with 
multiple timers, enhancing flexibility and coordination in complex applications 


Interrupt/direct memory access (DMA) generation: The timers can generate interrupts 
or DMA requests based on various events, including counter overflow/underflow, counter 
initialization, trigger events, input capture, and output compare 


Encoder support: These timers also support incremental (quadrature) encoders and Hall-sensor 
circuitry, making them suitable for precise positioning applications 


The advanced timer, TIM1, has a 16-bit counter size. Apart from the difference in counter size, it shares 
all the features of the general-purpose timers with the following additional features: 


Complementary outputs: Support for complementary outputs with programmable 
dead-time insertion 


Repetition counter: Allows updating the timer registers only after a specific number of 
counter cycles 


Break input: It has the ability to put the timer’s output signals into a reset or known state 
upon activation 


Before we start analyzing some of the key registers provided in the reference manual, let’s first 
understand how STM32 timers work. 
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How STM32 timers work 


The core of the timer is the 16-bit/32-bit counter and its associated auto-reload register. The counter 
can count upward or downward, and its clock can be divided by a prescaler. We can read from and 
write to the counter, auto-reload register, and prescaler register even while the counter is running. 


The time-base unit includes the following registers: 


e Counter register (TIMx_CNT) 
e Prescaler register (TIMx_PSC) 


e Auto-reload register (TIMx_ARR) 


The prescaler register (TIMx_PSC) divides the timer’s input clock frequency by a programmable value 
between 1 and 65,536. This allows for slower counting rates to accommodate longer timing intervals. 


The auto-reload register (TIMx_ARR) defines the value at which the counter resets to zero in up-counting 
mode or to the auto-reload value in down-counting mode. We use it to set the period of the timer. 


Apart from the registers in the time-base unit, the timer peripheral also includes control registers 
(TIMx_CR1, TIMx_CR2, and so on) for configuring the various operational parameters, such as 
enabling the counter, setting the counting direction, and configuring update events (UEVs). A status 
register (SR) (TIMx_SR) indicates the status of the timer, such as whether a UEV has occurred and 
other auxiliary registers for controlling the timer, among other things. 


As we mentioned earlier, the timer has two counting modes: up-counting mode and down-counting 
mode. Let’s break them down: 


e Up-counting mode: 

The counter counts from 0 to the auto-reload value (TIMx_ARR) and generates a UEV on overflow 
e Down-counting mode: 

The counter counts down from the auto-reload value to 0 and generates a UEV on underflow 


The next question that arises is: What exactly is a UEV? 


Let’s break this down: 


e In simple terms, it is when the timeout should occur 
e When the counter reaches the auto-reload value or 0 (based on the count mode), a UEV occurs 


e This event can trigger an interrupt or a DMA request and set the update interrupt flag (UIF) 
in the SR (TIMx_ SR) 


e UEVs can also be generated manually by setting the UG bit in the event generation register 
(TIMx EGR) 
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Now that we know what a UEV is, let’s understand how timer clock prescaling is achieved; this will 
enable us to accurately calculate the timing for UEVs. 


Timer clock prescaling 


We use prescaling to reduce the high-frequency system clock to a lower frequency suitable for driving 
the timer. Figure 9.1 illustrates the process of timer clock pre-scaling in our STM32 microcontroller, 
showing how the system clock (SYSCLK) is divided down to drive the timer counters (TIMx_CNT) 
through a series of prescalers: 


elas — TIMx_PSC —> TIM_CNT 
Prescaler a 
Divide by: Divide by: 
1,2,4,8...16 1,2,4,8...65535 
SYSCLK 
= ANB —| HCLK 
Prescaler 
Divide by: Divide by: 
1,2,4,8...512 1,2,4,8...256 
nee — TIMx_PSC — TIM_CNT 
Prescaler = 
Divide by: Divide by: 
1,2,4,8...16 1,2,4,8...65535 


Figure 9.1: Timer clock prescaling 


Let’s explain each component and step involved in this process: 


e System clock (SYSCLK) 


This is the main system clock that drives the microcontroller. It serves as the initial clock source 
for all subsequent operations. 


e Advanced High-performance Bus (AHB) prescaler 


This is the first prescaler in the chain, which divides the SYSCLK frequency by a factor of 1, 
2, 4, 8, 16, 64, 128, 256, or 512. The resulting clock signal is referred to as HCLK (AHB clock). 


e High-performance Bus Clock (HCLK) 


This is the clock signal generated after the AHB prescaler. It is used to drive the core and the 
system bus. 
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e Advanced Peripheral Bus (APB) prescalers 


APBI prescaler: This further divides HCLK to generate the clock for the APB1 peripheral 
bus. The division factors available are 1, 2, 4, 8, and 16. 
APB2 prescaler: Similar to the APB1 prescaler but used for the APB2 peripheral bus. It also 
divides HCLK by factors of 1, 2, 4, 8, and 16. 

e Timer prescaler (TIMx_PSC) 


Each timer has its own prescaler register that can divide the APB1 or APB2 clock further by 
any value between 1 and 65,535. This flexibility allows for precise control over the timer’s 
counting rate. 


¢ Timer counter (TIMx_CNT) 


This is the final counter that uses the clock derived from TIMx_PSC. The rate at which this counter 
increments or decrements depends on the combined division effect of the previous prescalers. 


The prescalers directly influence the rate at which the timer counter (TIMx_CNT) increments or 
decrements. While the other prescalers (AHB and APB) are shared among all peripherals on the same 
bus, the TIMx_PSC prescaler is unique to each timer. By adjusting the TIMx_PSC prescaler, we can 
precisely control the resolution and period of that specific timer without affecting other peripherals. 


Now, let’s learn how to compute a UEV. 
Computing a UEV 


A UEV occurs when the timer counter reaches the value set in the auto-reload register (TIMx_ARR) 
in up-counting mode. Calculating this event is crucial for determining the timer’s period and ensuring 
it operates at the desired frequency. 


We can derive the frequency of the UEV with the following formula: 


pae Timer Clock 
Update Event Frequency ~ (Prescaler + 1) X (Period + 1) 


Here, we have the following: 


e Timer clock: The clock frequency driving the timer (APB1 or APB2). The APB1 and APB2 bus 
frequencies are equal to SYSCLK when using the default clock configuration (in a bare-metal 
setup) on the NUCLEO-F411 development board. 


e Prescaler: The value set in the TIMx_PSC register. 


e Period: The value set in the TIMx_ARR register. 
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Let’s see an example. Let’s say we have the following parameters: 
e Timer clock (APB1 clock): 16 MHz 
e Prescaler (TIMx_PSC value): 15999 
e Period (TIMx_ARR value): 499 

This means we have the following values: 


e The timer clock is 16,000,000 Hz 


e The TIMx_PSC register is set to 15999, but since the prescaler divides the clock by the value 
plus one, we use 15999 + 1 = 16000 


e The TIMx_ARR register is set to 499, but since the period counts to the value plus one, we 
use 499 + 1 = 500 


Plugging these values into our formula, we get the following: 


= —___16,000,000__ 
Update Event Frequency = (15999 +1) X 4994 1) 
— 16,000,000 
8000000 
= 2Hz 


This means the UEV occurs at a frequency of 2 Hz, or twice every second. This concludes this section. 
In the next section, we will develop our timer driver. 


Developing the timer driver 


In this section, we will apply the knowledge gained about the TIM peripheral to develop a driver for 
generating delays. 


First, create a copy of your previous project in your IDE, following the steps outlined in earlier 
chapters. Rename this copied project GTIM. Next, create a new file named tim. c in the Src folder 
and another file named tim. h in the Inc folder. 


Our goal is to develop a driver that initializes TIM2 to generate a 1 Hz timeout. Populate your tim.c 
file with the following code: 


#include "tim.h" 
#define TIM2EN (1U<<0) 
#define CR1_ CEN (1U<<0) 


void tim2_lhz init (void) 


{ 


/*Enable clock access to tim2*/ 
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RCC->APB1ENR |=TIM2EN; 


/*Set prescaler value*/ 


TIM2->PSC = 1600 - 1 ; 
/*Set auto-reload value*/ 
TIM2->ARR = 10000 - 1; 


/*Clear counter*/ 
TIM2->CNT = 0; 


/*Enable timer*/ 
MoM2— -CRIE CRIM ecEN; 


} 

Let’s break it down: 
#define TIM2EN (1U<<0) 
#define CRITCEN (1U<<0) 


The TIM2EN instance is defined as (1U<<0) , which sets bit 0. This is used to enable the clock for TIM2. 


The CR1_CEN instance is defined as (1U<<0) , which also sets bit 0. This is used to enable the counter 
in the TIM2 control register 1 (CR1): 


/* Enable clock access to TIM2 */ 
RCC->APB1ENR |= TIM2EN; 


This line enables the clock for TIM2 by setting the appropriate bit in the APB1 peripheral clock enable 
register (RCC- >APB1ENR): 


/* Set prescaler value */ 
TIM2->PSC = 1600 - 1; Hi AS pOOO, 000 // 1-600 = allo), 000 


The prescaler value is set to 1599. The prescaler divides the input clock frequency (16 MHz) by (1599 
+ 1), resulting in a 10,000 Hz (10 kHz) timer clock: 


/* Set auto-reload value */ 
TIM2->ARR = 10000 - 1; 


The auto-reload value is set to 9999. This means the timer will count from 0 to 9999, creating a period 
of 10,000 ticks. Since the timer clock is 10 kHz, counting 10,000 ticks results in 1 second (10,000 / 
10,000 Hz = 1s): 


/* Clear counter */ 
TIMEZ CNTEE (0)5 
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This line resets the timer counter to 0. It ensures that the counting starts from 0 when the timer is enabled: 


/* Enable timer */ 
TIM2->CR1 = CR1_ CEN; 


This line enables the timer by setting the CEN (Counter Enable) bit in the TIM2 control register 1 (CR1). 
This starts the timer, and it begins counting based on the configured prescaler and auto-reload values. 


In summary, our code accomplishes the following: 
1. Enables the clock for TIM2. 
2. Sets a prescaler value to divide the input clock to 10 kHz. 
3. Sets an auto-reload value to make the timer count up to 10,000, creating a 1-second period. 
4. Clears the timer counter. 
5. Enables the timer. 
Our next task is to populate the tim. h file. 


Here is the code: 


#inelude “stm32£4soc. h" 
#ifndef TIM H_ 
#define TIM _H_ 


#define SIR WIE (1U<<0) 
void tim2_lhz MENi 


#endif 


The following line defines a SR_UIF macro that sets the Oth bit (least significant bit) to 1: 


#define SR UIE (1U << 0) 


This bit represents the UIF in the SR of the timer. When the timer overflows or reaches the auto-reload 
value, this flag is set, indicating a UEV. We will need to access this bit in the main . file. 


We are now ready to test inside main.c. 
Update your main . c file as shown next: 


#include "gpio.h" 
tine iude mucim ny 


int main(void) 


{ 


/*Initialize LED*/ 


Summary 


Gel _sisaulic (() 5 
/*Initialize timer*/ 
(ean) alla, alsatalic (()) 9 
while (1) 
{ 
eel icosielle () F 
/*Wait for UIF */ 
while(!(TIM2->SR & SR_UIF)) {} 
/*Clear UIF*/ 
TIM2->SR &=~SR_UIF; 


} 


In this code, we initialize the LED and TIM2 to toggle the LED at a 1 Hz frequency. The code works 
by waiting for the timer’s UIF to be set, indicating that 1 second has passed, then toggling the LED 
and clearing the UIF to repeat the process. This cycle continues indefinitely in the main loop, resulting 
in the LED toggling every second. 


Summary 


In this chapter, we explored the general-purpose timers (TIM) in the STM32F411 microcontroller, 
which are distinct from the SysTick timer and offer a variety of features crucial for embedded systems 
applications. We began by discussing the fundamental uses of timers, emphasizing their importance 
in tasks such as time interval measurement, delay generation, and event triggering. 


We then learned about the specifics of STM32 timers, detailing their classification into general-purpose 
and advanced timers. 


Next, we examined the workings of STM32 timers, focusing on key registers such as the counter register 
(TIMx_CNT), prescaler register (TIMx_PSC), and auto-reload register (TIMx_ARR). We explained 
how the timer’s clock prescaling mechanism reduces the system clock to a suitable frequency for the 
timer and how this affects the timer’s operation. 


We also provided a practical example of computing UEV frequency, demonstrating how to calculate 
the period and frequency of the timer based on the prescaler and auto-reload values. 


Finally, we applied the theoretical knowledge by developing a timer driver for TIM2 to generate a 1 
Hz timeout. In the next chapter, we will learn about another useful peripheral. 
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The Universal Asynchronous 
Receiver/Transmitter Protocol 


In this chapter, we will learn about the universal asynchronous receiver/transmitter (UART) protocol, 
an important communication method widely used in embedded systems. UART is fundamental for 
enabling communication between microcontrollers and various peripherals, making it an essential 
component in embedded systems development. 


We will start by discussing the significance of communication protocols in embedded systems and 
highlight common use cases for UART alongside other protocols such as SPI and I2C. Following this, 
we will provide a comprehensive overview of the UART protocol, detailing its operational principles 
and features. Next, we will extract and examine the relevant registers for UART from the STM32 
reference manual, providing the necessary foundational knowledge for driver development. Finally, 
we will apply this knowledge to develop a bare-metal UART driver, illustrating the practical aspects 
of initializing and transmitting data via UART. 


In this chapter, we will cover the following main topics: 
e Introduction to communication protocols 
e Overview of the UART protocol 
e The STM32F4 UART peripheral 
e Developing the UART driver 


By the end of this chapter, you will have a good understanding of the UART protocol and the skills 
needed to develop bare-metal drivers for UART communication. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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Introduction to communication protocols 


In the world of embedded systems, communication protocols are essential conduits that enable 
microcontrollers and peripheral devices to talk to each other seamlessly. Think of them as the languages 
that different devices use to understand and exchange information, ensuring that everything from 
your smartphone to your smart home devices works smoothly. 


Let’s dive into what communication protocols are, how they are grouped, their unique features and 
advantages, and explore some common use cases to see these protocols in action. 


What are communication protocols? 


Communication protocols are sets of rules and conventions that allow electronic devices to communicate 
with each other. These protocols define how data is formatted, transmitted, and received, ensuring 
that devices can exchange information accurately and reliably. Without these protocols, it would 
be like trying to have a conversation with someone who speaks a completely different language - 
communication would be chaotic and error-prone. 


In embedded systems, these protocols are crucial because they facilitate the interaction between 
microcontrollers and peripherals such as sensors, actuators, displays, and other microcontrollers. 
Whether it’s sending a simple temperature reading from a sensor to a microcontroller or streaming 
video data from a camera module, communication protocols make it happen. 


Let’s analyze the classification of communication protocols, starting with the big picture: what 
communication protocols can be broadly classified into - serial and parallel communication. 


Serial versus parallel communication 


Let’s start with serial communication. 


Serial communication 


In this category, communication protocols can be further broken down into asynchronous and 
synchronous protocols: 


« Asynchronous: This type of communication sends data one bit at a time without a clock 
signal to synchronize the sender and receiver. Think of it as sending letters through the mail 
without a scheduled delivery time. A common example is UART, which is simple and efficient 
for many applications. 


e Synchronous: Unlike asynchronous communication, this form of communication uses a clock 
signal to coordinate the transmission of bits. It’s like having a drumbeat to ensure everyone 
marches in step. Examples include Serial Peripheral Interface (SPI) and Inter-Integrated 
Circuit (I2C). These protocols ensure data integrity and timing, making them suitable for 
more complex tasks. 
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Parallel communication 


This type involves transmitting multiple bits simultaneously over multiple channels. Imagine sending 
a whole fleet of cars instead of a single one — it’s faster but requires more lanes (or pins, in our case). 
While parallel communication is faster, it’s less common in embedded systems due to the higher pin 
count. Also, it’s prone to crosstalk and signal integrity problems, especially over longer distances. 


We can also classify communication protocols based on their architecture. In this classification system, 
we have point-to-point communication and multi-device communication. 


Point-to-point versus multi-device communication 


Let’s look at the differences. 


Point-to-point 


This is a direct line of communication between two devices. UART is a classic example, where data 
flows directly between a microcontroller and a peripheral device. It’s straightforward, reliable, and 
ideal for many embedded systems. 


Multi-device (bus) communication 


Here, multiple devices share the same communication lines, which can be either of the following: 


e Multi-master: Multiple devices can control the communication bus. I2C is a great example as 
it allows multiple masters and slaves on the same bus. It’s like a group of friends taking turns 
talking in a conversation. 


e Master-slave: One master device controls the communication, directing traffic to and from 
multiple slave devices. SPI operates this way, with a single master communicating with multiple 
slaves through dedicated lines. I2C can also operate this way. It’s akin to a teacher (master) 
calling on students one at a time to speak. 


Lastly, communication protocols can be classified based on their data flow capabilities. 


Full-duplex versus half-duplex 
Let’s see the differences between full-duplex and half-duplex: 


e Full-duplex: This allows simultaneous two-way communication. Imagine a two-lane road 
where cars can travel in both directions at the same time. UART and SPI support full-duplex 
communication, making them highly efficient for real-time data exchange. 


e Half-duplex: Here, communication can occur in both directions, but not at the same time - 
it’s like a single-lane road where cars must take turns. I2C typically operates in half-duplex 
mode, which works well for its intended applications but can be a limitation in high-speed 
data scenarios. 
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Now, let’s closely compare the three common communication protocols that are used in modern 
embedded systems. 


Comparing UART, SPI, and I2C 


Let’s start with UART. 


UART 


Here are some key features of UART: 


Asynchronous communication: UART doesn't require a clock signal. Instead, it uses start and 
stop bits to synchronize data transmission. 


Full-duplex: UART can send and receive data simultaneously, which is ideal for many applications 
requiring real-time communication. 


Simple and cost-effective: With minimal hardware requirements, UART is easy to implement 
and cost-effective. 


The following are some of its advantages: 


Ease of use: Setting up UART communication is straightforward, making it a popular choice 
for beginners and simple applications 


Wide support: UART is universally supported by most microcontrollers and peripheral devices 


Low overhead: The lack of a clock signal means fewer pins are used, reducing complexity 


However, it also has some disadvantages: 


Speed limitations: UART is generally slower compared to SPI and I2C, making it less suitable 
for high-speed data transfer 


Limited distance: Susceptibility to noise over long distances can limit the range of 
reliable communication 


Point-to-point only: UART is designed for direct, point-to-point communication, which can 
be a limitation if multiple devices need to communicate 


Next, we have SPI. 
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SPI 


Here are some key features of SPI: 


Synchronous communication: SPI uses a clock signal along with data lines, ensuring synchronized 
data transfer 


Full-duplex: It allows data to be sent and received simultaneously 


Master-slave architecture: One master device controls multiple slave devices, with dedicated 
lines for each 


The following are some of its advantages: 


High speed: SPI supports high-speed data transfer, making it ideal for applications requiring 
fast communication 


Versatility: SPI can connect multiple devices with different configurations, providing flexibility 
in design 


However, it also has some disadvantages: 


More pins required: Each slave device needs a separate select line, which can increase the pin 
count significantly 


No standardized acknowledgment: Unlike I2C, SPI does not have a built-in acknowledgment 
mechanism, which can make error detection more challenging 


Limited multi-master capability: SPI is not designed for multi-master systems, which can be 
a limitation in some scenarios 


The final common communication protocol we'll cover is I2C. 


12C 


Here are some key features of I2C: 


Synchronous communication: I2C uses a clock signal for synchronized data transfer 


Multi-master capability: Multiple master devices can share the same bus, which is useful in 
more complex systems 


Two-wire interface: I2C requires only two lines (SDA and SCL) for communication, minimizing 
the pin count 
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The following are some of its advantages: 


e Simplicity in wiring: The two-wire interface reduces the complexity and number of pins required 


e Multi-device support: I2C easily connects multiple devices on the same bus, each with a 
unique address 


e Built-in addressing: I2C has a built-in addressing mechanism, making communication with 
multiple devices straightforward 


However, it does have some disadvantages: 


e Slower speed: I2C is generally slower than SPI, which can be a limitation for high-speed applications 


e Complex protocol: The protocol is more complex than UART and SPI, requiring more 
sophisticated handling of data transfers and addressing 


e Susceptible to noise: Like UART, I2C can be susceptible to noise over longer distances, 
potentially affecting communication reliability 


Choosing the right communication protocol depends on your specific application needs. If you need 
simple, straightforward communication and can tolerate slower speeds, UART is a great choice. For 
high-speed applications with a need for full-duplex communication, SPI is ideal, especially if you 
can manage the higher pin count. When you need to connect multiple devices with minimal wiring 
and have a complex communication setup, I2C is your go-to protocol. To help you better understand 
when to choose which protocol, let’s explore some common use cases. 


Common use cases for the UART, SPI, and I2C protocols 


When designing embedded systems, selecting the right communication protocol is crucial for ensuring 
efficient and reliable data exchange. UART, SPI, and I2C each have unique strengths, making them 
suitable for different applications. Let’s explore the practical use cases and compelling case studies for 
each protocol, highlighting their professional and real-world relevance. 


UART 
Let’s look at some common use cases for the UART protocol: 


e Serial communication with PCs: UART is often used for serial communication between 
microcontrollers and computers, particularly for debugging, firmware updates, and data logging 


e GPS modules: UART can be used to transmit location data from a GPS module to a microcontroller 


e Bluetooth modules: UART enables wireless communication with devices via Bluetooth 


These use cases represent some of the most common applications of UART, but the protocol is versatile 
and can be used in many other scenarios that require simple serial communication. 
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Case study - GPS module integration for autonomous drones 

Imagine you're developing an autonomous drone that requires precise navigation to perform 
tasks such as surveying and mapping. Integrating a GPS module using UART can provide 
real-time location data essential for navigation. 


Setup: Connect the GPS module's transmit (TX) pin to the microcontroller’s receive (RX) pin 
and vice versa. Configure the baud rate so that it matches the GPS module’s output. 


Operation: The GPS module continuously sends NMEA sentences (text strings) containing 
location data. The microcontroller reads these strings via UART, parses them, and uses the 
location information to navigate the drone accurately. 


Advantage: UART’s simplicity and widespread support make it straightforward to integrate the 
GPS module, providing reliable and continuous data flow without a complex setup. 


Next, well look at SPI. 


SPI 


The following are some common use cases for the SPI protocol: 


Like UART, these examples highlight some typical uses of SPI, but the protocol’s high-speed capabilities 


e High-speed data transfer: It’s ideal for applications such as memory cards, analog-to-digital 


converters (ADCs), digital-to-analog converters (DACs), and displays 


e Display modules: SPI can be used for communicating with high-resolution displays requiring 


fast refresh rates 


e Sensors and actuators: SPI can handle high-frequency data outputs from various sensors 


make it suitable for a wide range of other applications requiring rapid data transfer. 


cr 


Case study - SD card data logging for industrial equipment 
Consider an industrial monitoring system that logs data from various sensors to an SD card 
for long-term analysis. SPI is the perfect protocol for this high-speed data transfer. 


Setup: Connect the microcontroller to the SD card using SPI pins (MISO, MOSI, SCLK, and 
CS). Initialize the SPI bus and configure the SD card. 


Operation: The microcontroller collects data from sensors (for example, temperature, pressure, 
and vibration) and writes this data to the SD card in real time. 


Advantage: SPI’s high-speed data transfer ensures that large amounts of data are logged quickly 
and efficiently, preventing any data loss and ensuring accurate monitoring. 


Using SPI in this scenario allows the industrial system to maintain precise logs of critical parameters, 


which are essential for predictive maintenance and operational efficiency. 
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Finally, we have I2C. 
12C 
Let’s consider two common use cases related to I2C: 


e Multiple sensor integration systems: This involves connecting several sensors with different 
addresses on the same I2C bus 


e Peripheral expansion: This involves adding more GPIO pins to a microcontroller using 
I2C expanders 


These use cases are just two examples of I2C’s applications. Its ability to support multiple devices 
on a single bus makes it an excellent choice for many other scenarios where scalability is important. 


Case study - environmental monitoring system for smart agriculture 

Let’s say you're developing a smart agriculture system that uses multiple sensors (temperature, 
humidity, and soil moisture) to optimize farming conditions. I2C is the ideal protocol for this 
multi-sensor integration. 


Setup: Connect all sensors to the I2C bus (SDA and SCL lines). Assign each sensor a unique address. 


Operation: The microcontroller queries each sensor in sequence, collects the data, and processes 
it to provide insights and control irrigation, ventilation, and lighting systems. 


Advantage: I2C’s ability to support multiple devices on the same bus with just two lines 
simplifies wiring, reduces costs, and saves GPIO pins, making it an efficient solution for 
complex sensor networks. 


Starting with the next section, we'll focus exclusively on the UART protocol. We'll cover the I2C and 
SPI protocols in the following chapters. 


Overview of the UART protocol 


One of the most fundamental and widely used protocols is UART. Whether you're debugging hardware 
or enabling communication between a microcontroller and peripherals, understanding UART is 
crucial. Let’s delve into the workings of this protocol. 


What is UART? 


UART is a hardware communication protocol that operates using asynchronous serial communication, 
allowing for adjustable data transmission speeds. The “asynchronous” nature of UART means it doesn't 
require a clock signal to align the transmission of bits between the sender and receiver. Instead, both 
devices must agree on a specific baud rate, which dictates the speed at which data is exchanged. Let’s 
take a look at the interface. 
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The interface 


The UART interface employs two wires for communication: TX and RX. To establish a connection 
between two devices, we simply connect the TX pin of the first device to the RX pin of the second 
device, and the RX pin of the first device to the TX pin of the second device. Additionally, it’s crucial 
to connect the ground pins of both devices to ensure a common electrical reference. Figure 10.1 shows 
the connection between two UART devices: 


Ea 


UART UART 
DEVICE 1 DEVICE 2 


Figure 10.1: The UART interface 


How UART works 


Data in UART is transmitted as frames containing a start bit, data bits, an optional parity bit, and 
stop bits: 


Start 
Bit 


Data Frame 


1 bit 5 to 9 bits 1bit 1 or 2 bits 


Figure 10.2: The UART data packet 


Here’s a step-by-step breakdown of the process: 
1. Start bit: The transmission line is normally held high. To start the data transfer, the transmitting 
UART pulls the line low for one clock cycle. This indicates the start of a new data frame. 


2. Data frame: Following the start bit, the data frame typically consists of 5 to 9 bits and is sent 
from the least significant bit (LSB) to the most significant bit (MSB). 


198 


The Universal Asynchronous Receiver/Transmitter Protocol 


3. Parity bit: This is optional and is used for error checking. It ensures that the number of set bits 
(1s) in the data is even or odd. 


4. Stop bits: This is one or two bits indicating the end of the data packet. The line is driven high 
during the stop bits. 


Let’s take a closer look at the start, stop, and parity bits. 
The start, stop, and parity bits 


These bits form the backbone of the UART protocol, allowing devices to synchronize and verify the 
integrity of the transmitted data. 


Start bit 


The start bit is the initial signal that marks the beginning of a data frame in UART communication. 
When the transmitting device is idle, the data line is held at a high voltage level (logic 1). To signal 
the start of transmission, the UART transmitter pulls the line to a low voltage level (logic 0) for a 
1-bit duration. This transition from high to low alerts the receiving device that a new data packet is 
incoming, allowing it to synchronize and prepare for data reception. 


Stop bit 


After the data bits and optional parity bit are transmitted, the stop bit signals the end of the data 
frame. The transmitter drives the data line back to a high voltage level (logic 1) for 1 or 2-bit durations, 
depending on the configuration. The stop bit(s) ensure that the receiver has time to process the last 
data bit and prepare for the next start bit. In essence, the stop bit acts as a buffer, providing a clear 
demarcation between successive data frames and helping maintain synchronization between the 
communicating devices. 


Parity bit 


The parity bit is an optional feature that’s used for basic error checking in UART communication. 
It provides a simple method to detect errors that may have occurred during data transmission. The 
parity bit can be configured for either even or odd parity: 


e Even parity: The parity bit is set to 0 if the number of 1s in the data frame is even, and set to 
1 if the number of 1s is odd. This ensures that the total number of 1s (including the parity bit) 
is even. 


e Odd parity: The parity bit is set to 0 if the number of 1s in the data frame is odd, and set to 1 if 
the number of 1s is even. This ensures that the total number of 1s (including the parity bit) is odd. 


When the receiver gets the data frame, it checks the parity bit against the received data bits. If there's a 
mismatch, it indicates that an error occurred during transmission. While parity doesn’t correct errors, 
it helps in identifying them, prompting for retransmission if necessary. 
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The start, stop, and parity bits are essential components of UART communication, each playing a critical 
role in ensuring data integrity and synchronization. The start bit signals the beginning of transmission, 
the stop bit marks the end, and the parity bit provides a basic error-checking mechanism. Together, 
they create a robust framework for reliable and efficient serial communication between devices. 


Before wrapping up this section, let’s take a moment to understand the unit of speed that’s used in 
UART communication. 


Understanding the baud rate - the speed of communication in embedded 
systems 


In the world of embedded systems, baud rate is a term you'll encounter frequently. Whether you're 
debugging a microcontroller, setting up a serial communication link, or working with various 
peripherals, understanding the baud rate is essential. But what exactly is the baud rate, and why is it 
so important? Let’s break it down. 


What is the baud rate? 


The baud rate is essentially the speed at which data is transmitted over a communication channel. It’s 
measured in bits per second (bps). Think of it as the speed limit on a highway: the higher the baud 
rate, the more data can travel along the communication path in a given amount of time. 


For example, a baud rate of 9,600 means 9,600 bits of data are transmitted each second. In other words, 
it sets the pace for how fast data packets are sent and received. 


However, it’s important to distinguish between the baud rate and the bit rate. While the baud rate 
refers to the number of signal changes per second, the bit rate is the number of bits transmitted per 
second. In simple systems, each signal change can represent one bit, making the baud rate and bit rate 
the same. In more complex systems, each signal change can represent multiple bits, resulting in a bit 
rate higher than the baud rate. 


Why does the baud rate matter? 


Imagine trying to have a conversation with someone who speaks at a wildly different speed than 
you. It would be confusing and inefficient, right? The same principle applies to electronic devices 
communicating with each other. Both the transmitting and receiving devices need to agree on a 
common baud rate to understand each other correctly. If they don't, the data might get lost or garbled, 
leading to communication errors. 


For successful communication, both the sender and receiver must have the same baud rate to 
synchronize correctly. If one device is set to 9,600 bps and the other to 115,200 bps, the communication 
will fail, similar to how a conversation fails if one person is speaking too fast or too slow for the other 
to understand. 
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There are standard baud rates that are commonly used in serial communication. Here are a few: 
e 300 bps: Very slow, often used for long-distance communication where bandwidth is limited 
e 9,600 bps: A widely used default rate for many devices, including microcontrollers 
e 19,200 bps: Faster, often used in more data-intensive applications 
e 115,200 bps: High-speed communication, common in applications requiring quick data transfer 
This concludes our overview of the UART protocol. In the next section, we will explore the UART 


peripheral in the STM32F4 microcontroller. 


The STM32F4 UART peripheral 


STM32 microcontrollers often include several UART peripherals, though the number varies depending 
on the specific model. The STM32F411 microcontroller has three UART peripherals: 


e USART1 
e USART2 
e USART6 
USART versus UART 


Our STM32 documentation refers to the UART peripheral as USART because it stands 
for universal synchronous/asynchronous receiver/transmitter. This name reflects the dual 
functionality of the peripheral: 


Asynchronous mode (UART): In this mode, the USART operates as a traditional UART. 
It transmits and receives data without needing a clock signal, which is typical for standard 
serial communication. 


Synchronous mode (USART): In this mode, the USART can also operate with a synchronous 
clock signal, allowing it to communicate with devices that require a clock line in addition to 
the data lines. 


Let’s analyze the key registers of this peripheral, starting with the USART Status Register. 
USART Status Register (USART_SR) 


The USART_SR register is one of the main registers used to monitor the status of the UART peripheral. 
It provides real-time information about various operational flags and errors. 
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Let’s consider the key bits in this register: 


Transmit data register empty (TXE): This bit is set when the data register is empty and ready 
for new data to be written. It indicates that the transmitter can send more data. 


Read data register not empty (RXNE): This bit indicates that the data register contains data 
that has not been read yet. It signals that there is incoming data to be processed. 


Transmission complete (TC): This bit is set when the last transmission has been completed, 
including all the stop bits. It shows that the data has been fully sent. 


Overrun error (ORE): This bit indicates that the data was lost because the data register wasn’t 
read before new data arrived. It flags an error condition. 


You can find detailed information about this register on page 547 of the STM32F411 reference manual 
(RM0383). Next, we have the USART Data Register (USART_DR). 


USART Data Register (USART_DR) 


The USART_DR register is used for both transmitting and receiving data. It acts as the primary interface 
for data exchange through the UART peripheral. 


The following are the key functions in this register: 


Transmitting data: Writing a byte to USART_DR sends the data through the TX line. The 
UART peripheral handles the conversion and transmission serially. 


Receiving data: Reading from USART_DR retrieves the data received on the RX line. This 
should be done promptly to avoid data overrun. 


Next, we have the USART Baud Rate Register (USART_BRR). 


USART Baud Rate Register (USART_BRR) 


The USART_BRR register is used to set the baud rate for the UART communication, which is critical 
for synchronizing the data transfer speed between devices. 


This register has two fields: 


Mantissa: The integer part of the division factor that sets the baud rate 


Fraction: The fractional part of the division factor that fine-tunes the baud rate 


The final register we will examine is the USART Control Register 1 (USART_CR1). 
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USART Control Register 1 (USART_CR1) 


The USART_CR1 register is a comprehensive control register that enables various UART functionalities 
and configurations. 


Let’s consider the key bits in this register: 


e USART enable (UE): This bit enables or disables the UART peripheral. It must be set to activate 
UART communication. 


e Word length (M): This bit configures the word length, allowing 8-bit or 9-bit data frames. 
e Parity control enable (PCE): This bit enables parity checking for error detection. 

e Parity selection (PS): This bit selects even or odd parity. 

e Transmitter enable (TE): This bit enables the transmitter, allowing data to be sent. 


e Receiver enable (RE): This bit enables the receiver, allowing data to be received. 


With these registers in mind, were now ready to develop the UART driver. We will dive into that in 
the next section. 


Developing the UART driver 


In this section, we will apply everything weve learned about the UART peripheral to develop a driver 
for transmitting data using the USART2 peripheral. 


Let’s begin by identifying the GPIO pins connected to the UART2 peripheral. To do this, refer to the 
table on page 39 of the STM32F411RE datasheet. This table lists all the GPIO pins of the microcontroller, 
along with their descriptions and additional functionalities. As shown in Figure 10.3, part of this table 
reveals that PA1 has an alternate function labeled as USART2_ TX: 


TIM2_CH3, 
TIM5_CH3, 
TIM9_CH1, 


12S2_CKIN, 
USART2_TX, 
EVENTOUT 


Figure 10.3: The USART2_TX pin 


To use PA2 as the USART2_ TX line, we need to configure PA2 as an alternate function pin in the 
GPIOA_MODER register and then specify the alternate function number for USART2_ TX in the 
GPIOA_AFRL register. The STM32F4 microcontroller allows us to choose from 16 different alternate 
functions, numbered from AF00 to AF15. The alternate function mapping table, which you can find 
on page 47 of the datasheet, outlines these functions and their corresponding numbers. As shown 
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in Figure 10.4, sourced from the datasheet, configuring PA2 as AF07 will set it to function as the 
USART2_ TX line: 


AF05 AF06 | AF07 | AF08 AF13 


SP12/12S2/ 
SPHA2SIS| ` Spi | SPI3/2S3/ 
12S3/SP14/ USART1/ | USART6 OTG1_FS 
I2S4/SPI5/ USART2 


1285 


12C1/12C2/ 
TIM1/TIM2 12¢3 


TIM3/ 
TIM4/ TIMS 


PI2/ 
12S2/SP13/ 
1283 


TIM2_CH1/ 


USART2_ 
TIM2_ETR | TIMS_CH1 CTS 


SP14_MOSI . USART2_ 
N2S4_SD RTS 


TIM2_CH3 | TIMS_CH3 | TIM9_CH1 - 12S2_CKIN . USART2_ 


Figure 10.4: PA2 alternate function 


TIM2_CH2 | TIM5_CH2 


We now have all the information we need to develop the UART?2 transmitter driver. 


Create a copy of your previous project and rename it UART. Next, create a new file named uart . c 
in the Src folder and another file named uart .h in the Inc folder. Populate your uart . c file 
with the following code: 


#include <stdint.h> 
#include "uart.h" 


#define GPIOAEN (1U<<0) 

#define UART2EN (1U<<17) 

#define DBG UART_BAUDRATE 115200 
#define SYS FREQ 16000000 
#define APB1 CLK SYS_FREQ 
#define CR1 TE (U3) 
#define CR1 UE (1U<<13) 
#define SR_TXE (1U<<7) 


static void uart set baudrate(uint32 t periph _clk,uint32_t baudrate) ; 
static void uart_write(int ch); 


int io putchar (int ch) 


{ 


uart_write (ch); 
return ch; 


void uart _init (void) 


{ 
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/*Enable clock access to GPIOA*/ 
RCC- >AHB1ENR le GPIOAEN; 


/*Set the mode of PA2 to alternate function mode*/ 
GPIOA->MODER &=~(1U<<4); 


GPIOA->MODER |=(1U<<5) ; 


/*Set alternate function type to AF7(UART2_TX) */ 


GPIOA->AFR[0] |=(1U<<8) ; 
GPIOA->AFR[0] |=(1U<<9) ; 
GPIOA->AFR[0] |=(1U<<10); 


GPIOA->AFR[0] &=~(1U<<11) ; 


/*Enable clock access to UART2*/ 
RCC->APB1ENR l= UART2EN; 


/*Configure uart baudrate*/ 
uart_set_baudrate(APB1 CLK,DBG UART BAUDRATE) ; 


/*Configure transfer direction*/ 
USART2->CR1 = CR1_ TE; 


/*Enable UART Module*/ 
USART2->CR1 |= CR1_UE; 


static void uart_write(int ch) 


{ 


/*Make sure transmit data register is empty*/ 
while (! (USART2->SR & SR_TXE)) {} 


/*Write to transmit data register*/ 
USART2->DR =(ch & OXFF) ; 


} 


static uint16 t compute uart bd(uint32 t periph _clk,uint32 t baudrate) 


{ 


return( (periph clk + (baudrate/2U) ) /baudrate) ; 


static void uart set baudrate(uint32_ t periph_clk,uint32_ t baudrate) 
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USART2->BRR = compute _uart_bd(periph_clk,baudrate) ; 


} 


Let’s break it down. 


First, we have the necessary includes and macros. 


#include <stdint.h> 


#include 


#define 
#define 


#define 
#define 
#define 
#define 
#define 
#define 


viart AU 


GPIOAEN (1U<<0) 
UART2EN (1U<<17) 


DBG_UART_BAUDRATE 115200 
SYS_FREQ 16000000 

APB1 CLK SYS _ FREQ 

CR1_ TE (1U<<3) 

CR1_UE (1U<<13) 

SR_TXE (1U<<7) 


Here are the uses of the macros: 


e GPIOAEN: This macro enables the clock for GPIOA by setting bit 0 in the AHB1ENR register. 


e UART2EN: This macro enables the clock for UART2 by setting bit 17 in the APB1ENR register. 


e DBG_UART_BAUDRATE: This macro defines the baud rate for UART communication, set to 
115200 bps. 


e SYS FREQ: This macro defines the system frequency, set to 16 MHz, and the default frequency 
of the STM32F411 microcontroller on the NUCLEO development board. 


e APB1_ CLK: This macro sets the APB1 peripheral clock frequency to the system frequency 
(16 MHz). 


e CR1_TE: This macro enables the transmitter by setting bit 3 in the USART_CR1 register. 


e CR1_UE: This macro enables the UART module by setting bit 13 in the USART_CR1 register. 


e SR_TXE: This macro represents the TXE bit in the USART_SR register. 


Next, we have the helper functions for computing and setting the baud rate: 


static uint1l6_t compute _uart_bd(uint32_t periph clk, uint32 t 
baudrate) 


{ 


return ((periph_clk + 


(baudrate / 2U)) 


/ baudrate) ; 
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This helper function calculates the baud rate divisor. It uses the peripheral clock and desired baud 
rate to compute the value to be set in the Baud Rate Register (BRR): 


Static volduant acct baudrate (Wint 2 mts perl pheclicn uants 2m baudrate)) 


{ 
} 


This function sets the baud rate for UART2 by writing the computed divisor to the BRR. Let’s turn 
our focus to the initialization function: 


USART2->BRR = compute_uart_bd(periph_clk, baudrate) ; 


RCC- >AHB1ENR |= GPIOAI 


G 


N; 


This line enables the clock for GPIOA by setting the appropriate bit in the AHB1 peripheral clock 
enable register: 


GPIOA->MODER &= ~(1U << 4); 
GPIOA->MODER |= (1U << 5); 


These lines configure pin PA2 to operate in alternate function mode, which is necessary for 
UART functionality: 


GPIOA->AFR[0] |= (1U << 8); 
GPIOA->AFR[0] |= (1U << 9); 
GPIOA->AFR[0] |= (1U << 10); 
GPIOA->AFR[0] &= ~(1U << 11); 


These lines configure PA2 as an alternate function (AF7), which corresponds to UART2_ TX: 


RCC->APB1ENR |= UART2EN; 


This line enables the clock for UART2 by setting the appropriate bit in the APB1 peripheral clock 
enable register: 


uart_set_baudrate(APB1 CLK, DBG UART_BAUDRATE) ; 


This function call sets the baud rate for UART2 using the uart_set_baudrate () function: 


USART2->CR1 = CR1_ TE; 


This configures UART2 for transmission by setting the transmitter enable bit in the control register: 


USART2->CR1 |= CR1_UE; 


This enables the UART2 module by setting the UART enable bit in the control register. 


Developing the UART driver 


Next, we have the function for writing to UART: 


static void uart_write(int ch) 


/* Make sure transmit data register is empty */ 
while (! (USART2->SR & SR_TXE)) {} 

/* Write to transmit data register */ 
USART2->DR = (ch & OXFF); 


Let’s break it down: 


while (! (USART2->SR & SR_TXE)) {} 


This loop ensures that the transmit data register is empty before we write new data: 


USART2->DR = (ch & OXxFF) ; 


This line writes the character to the data register for transmission. 
Finally, we have a useful function that allows us to redirect printf output to our UART transmitter: 


ime, alo) putchar ((akialic, (ellal)) 


{ 


uart_write(ch) ; 
return ch; 


} 


It calls uart_write() to send the character and then returns the character. 
After sending the character, _ io _putchar returns the same character, ch. 


Returning the character is a standard practice, allowing the function to comply with the typical 
putchar function signature, which returns the character written as an int variable. 


Our next task is to populate the uart .h file. Here’s the code: 


#ifndef UART H __ 
#define UART H __ 
#include "stm32f4xx.h" 
void uart init (void); 
#endif 
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Here, we are simply exposing the uart initialization function implemented in uart . c, making it callable 
from other files. We are now ready to test our driver in main.c. Update your main.c file, like so: 


#include <stdio.h> 
#include "uart.h" 
int main (void) 


{ 
/*Initialize debug UART*/ 
teree iae) 5 
while (1) 
{ 
printf ("Hello from STM32...\r\n"); 
} 
} 


This main function simply initializes the UART2 peripheral and then continuously prints the sentence 
Hello from STM32.... 


Let’s test the project. To do so, we'll need to install a program on our computer that can display the 
data that’s received through the computer's serial port. In this setup, our development board acts as 
the transmitter, while the computer is the receiver. 


1. Install a serial terminal program: 


* Choose a serial terminal program that’s appropriate for your operating system. Options 
include Realterm, Tera Term, Hercules, and Cool Term. 


: If yowre using Windows, I recommend Realterm. You can download it from 
SourceForge: https: //sourceforge.net/projects/realterm/. 


" Follow the installation wizard to complete the setup. 
2. Prepare to identify your development board’s serial port: 


I. Disconnect your development board from your computer. 
II. Open Realterm and navigate to the Port tab. 


HI. Click on the Port drop-down menu; you'll see a list of available ports. Since your 
development board is currently disconnected, its port won't appear in the list. Take 
note of the listed ports. 
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3. Identify the development board’s port: 


I. Close Realterm and connect your development board to the computer. 


II. Reopen Realterm and go back to the Port drop-down menu. You should now see a new 
port in the list, which corresponds to your development board. 


HI. Select this newly added port. 


4. Set the baud rate: 


Click the Baud drop-down menu and select 115200. This is the baud rate we configured in 
our driver. 


5. Build and run the project: 
Return to your IDE, build the project, and run the firmware on your microcontroller. 


6. Test the setup: 


" Go back to Realterm and click the Open button to start the communication. 


" You should see a message stating Hello from STM32... continuously being printed in 
the Terminal window. 


Figure 10.5 shows the settings described for Realterm: 


fs RealTerm: Serial Capture Program 2.0.0.70 = o x 


| 120-2 | 12CMisc | Misc | \n| Clear Freeze| ?| 


Status 


Software Flow Control EJ RXD (2) 


Display Port | Capture | Pins | Send | Echo Pott | 12C 


a ex oym elec o 
m Pod || C 7 bits | -Hardware Flow Control Transmit Xoff Char: |19 DCD (1) 
C Mak || © Sbits| | @ None C RTS/CTS i DSR (6) 
C Space | C 5bits| | C DTR/DSR C RS485-ts C Re Ring (9) 
@ Telnet BREAK 

Error 


Char Count:575684 |CPS:11950 Port: 8 115200 8N1 None 


Figure 10.5: Realterm settings 
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Summary 


In this chapter, we learned about the UART protocol, a fundamental communication method that’s 
widely used in embedded systems. We began by discussing the importance of communication 
protocols in embedded systems, emphasizing how UART, alongside SPI and I2C, facilitates seamless 
communication between microcontrollers and peripheral devices. 


Next, we provided a detailed overview of the UART protocol while covering its operational principles, 
including how data is transmitted asynchronously using start and stop bits, and the role of parity in 
error checking. We also discussed how the baud rate, a critical aspect of UART communication, is 
configured to ensure synchronized data transfer between devices. 


Then, we delved into the specifics of the STM32 UART peripheral, examining key registers such as the 
Status Register (USART_SR), Data Register (USART_DR), Baud Rate Register (USART_BRR), and 
Control Register 1 (USART_CR1). Understanding these registers is essential for configuring UART 
for effective communication in STM32 microcontrollers. 


Finally, we applied our theoretical understanding by developing a bare-metal UART driver for the 
STM32F4 microcontroller. This involved initializing the UART peripheral, setting the baud rate, and 
implementing functions for transmitting data. We also demonstrated how to redirect printf output 
to the UART, enabling easy debugging and data logging through a serial terminal. 


In the next chapter, we will learn about the analog-to-digital converter (ADC). 
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In this chapter, we will learn about the analog-to-digital converter (ADC), an important peripheral 
in embedded systems that enables the microcontroller to interface with the analog world. We will 
start by providing an overview of the analog-to-digital conversion process, the importance of the 
ADC, and its key specifications. 


Following this, we will extract and analyze the relevant registers from the STM32F411 reference 
manual that are necessary for ADC operations. Finally, we will develop a bare-metal ADC driver to 
demonstrate the practical application of the theoretical concepts weve discussed. 


In this chapter, were going to cover the following main topics: 
e Overview of analog-to-digital conversion 
e The STM32F4 ADC peripheral 
e The key ADC registers and flags 
e Developing the ADC driver 


By the end of this chapter, you will have a comprehensive understanding of the STM32 ADC peripheral 
and the skills necessary to develop efficient drivers for it, enabling you to effectively integrate analog- 
to-digital conversion capabilities into your embedded systems projects. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 
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Overview of analog-to-digital conversion 


ADC is a critical process in embedded systems that allows our microcontrollers to interpret and 
process real-world analog signals. In this section, we will walk through this process, explaining each 
step involved in converting an analog signal into digital values. 


What is analog-to-digital conversion? 


Analog-to-digital conversion is the process of converting a continuous analog signal into a discrete 
digital representation. Analog signals, which can have any value within a certain range, are transformed 
into digital signals, which have specific, quantized levels. This conversion is essential because 
microcontrollers and digital systems can only process digital data. 


The conversion process typically involves several key steps: sampling, quantization, and encoding. 
Let’s break down these steps, starting with sampling. 
Sampling 


Sampling involves measuring the amplitude of an analog signal at regular intervals, called sampling 
intervals. The result is a series of discrete values that approximate the original analog signal. Figure 11.1 
depicts this process: 


Analog Signal Digital Signal 


amplitude 


amplitude 


time 


%\__ sampling period, or 
sampling interval 


Figure 11.1: The sampling process 


The rate at which the analog signal is sampled is known as the sampling rate or sampling frequency. 
This is typically measured in samples per second (Hz). According to the Nyquist Theorem, the 
sampling rate must be at least twice the highest frequency present in the analog signal to accurately 
reconstruct the original signal. 


The next step in the process is quantization. 
Quantization 


Quantization is the process of mapping the sampled analog values to the nearest discrete levels 
available in the digital domain. Each discrete level corresponds to a unique digital code. 
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The number of discrete levels available for quantization is determined by the resolution of the analog- 
to-digital converter. For example, an 8-bit ADC has 256 levels (28), while a 12-bit ADC has 4096 
levels (212). 


The quantization process inherently introduces an error, known as quantization error or quantization 
noise, because the exact analog value is approximated to the nearest digital level. We can minimize 
this error by increasing the resolution of the ADC. For example, if an analog signal ranges from 0 to 
3.3V and an 8-bit ADC is used, the quantization step size is approximately 12.9 mV (3.3V/ 256). An 
analog input of 1.5V might be quantized to the closest digital level, which could be slightly higher or 
lower than 1.5V. 


The final step in the process is encoding. 
Encoding 


Encoding is the final step and is where the quantized levels are converted into a binary code that 
can be processed by the digital system. Each quantized level is represented by a unique binary value. 


The number of bits used in the ADC determines the binary code length. For example, a 10-bit ADC 
will produce a 10-bit binary number for each sampled value. Continuing with our previous example, 
if the quantized level for 1.5V is determined to be level 116, the binary representation would be 
01110100 for an 8-bit ADC. 


Figure 11.2 shows the encoding process of a 6-bit ADC. The columns in the table show the 6-bit binary 
representation of the quantization levels. For a 6-bit ADC, the digital output ranges from 000000 for 
the lowest quantization level to 111111 for the highest. Each binary value corresponds to a specific 
quantization level: 


t b5 b4 b3 b2 b1 bO 
Analog Signal 
0 1 
amplitude e © S vag a (as S A [— b5 2 0 0 1 1 1 0 
' —b4 
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Figure 11.2: The encoding process 


In summary, the analog-to-digital conversion process begins with an analog input signal, which can 
vary continuously over time. This signal could be a voltage from a temperature sensor, an audio signal, 
or any other analog signal. The analog-to-digital converter typically includes a sample-and-hold circuit 
that captures and holds the analog signal at each sampling interval. This ensures that the signal remains 
constant during the conversion process. The core of the ADC performs quantization and encoding. It 
compares the held analog value to a set of reference voltages to determine the closest matching digital 
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level. The resulting digital code is output from the ADC and can be read by our microcontroller or 
digital system for further processing. 


In the next section, we will explain some of the key terms that were used in this section, including 
resolution and reference voltage (VREF). 


Key specifications of the ADC - resolution, step size, and VREF 


To effectively use ADCs, it’s important to understand their key specifications, which define their 
performance for various applications. Let’s start with resolution. 


Resolution 


The resolution of an ADC determines the number of distinct output levels it can produce, corresponding 
to the number of intervals the input voltage range is divided into. It is typically expressed in bits. 
Higher resolution allows us to have a more precise representation of the analog input signal, reducing 
quantization error and improving the accuracy of measurements. 


For an N-bit ADC, the number of discrete output levels is 2N. For example, an 8-bit ADC has 256(28) 
levels, while a 12-bit ADC has 4,096 levels. 


Table 11.1 shows common ADC resolutions and their corresponding number of discrete levels. The 
following table highlights how the number of discrete levels increases exponentially with the resolution, 
providing finer granularity in the digital representation of the analog input signal: 


ADC Resolution (bits) Number of Discrete Levels (24N) 
8 256 

10 1,024 

12 4,096 

14 16,384 

16 65,536 

18 262,144 

20 1,048,576 

24 16,777,216 


Table 11.1: Common ADC resolutions 


The next key specification is the VREF. 


Overview of analog-to-digital conversion 


VREF 


The VREF is the maximum voltage that our ADC can convert. The analog input voltage is compared 
to this VREF to produce a digital value. The stability and accuracy of the VREF directly impact the 
accuracy of the ADC as any fluctuations in the VREF can cause corresponding errors in the digital output. 


We can choose to derive the VREF from the microcontroller or provide an external one for more 
precise applications. Internal references are convenient but might have higher variability, while external 
references can offer better stability and accuracy. The choice of VREF depends on our application's 
accuracy requirements and the nature of the analog signal being measured. 


For example, if the VREF is 5V, the ADC can accurately convert any analog input signal within the 
range of OV to 5V. 


The last specification we'll examine is the step size. 
Step size 


The step size is the smallest change in analog input that can be distinguished by the ADC. It is determined 
by the VRED and the resolution and it determines the granularity of the ADC’s output. A smaller 
step size indicates finer resolution, allowing the ADC to detect smaller changes in the input signal. 


The step size is calculated by dividing the VREF by 2 raised to the power of the ADCS resolution 
(number of bits): 


VREF 
2N 


Step size= 
Here, VREF is the reference voltage and N is the resolution bits. 


For example, for a 10-bit ADC with VREF = 3.3V: 


o 33V _ 33V _ 
Step size= 30 = Ioa = 3.22mV 


This means that each increment in the quantized digital output corresponds to a 3.22mV change in 
the analog input. Table 11.2 lists common ADC resolutions, the corresponding number of steps, and 
the step size using a VREF of 3.3V: 


ADC Resolution (Bits) Number of Steps (24N) | Step Size (mV) 


8 256 12.9 
10 1,024 3.22 
12 4,096 0.805 
14 16,384 0.201 


16 65,536 0.0504 
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ADC Resolution (Bits) Number of Steps (24N) | Step Size (mV) 


18 262,144 0.0126 
20 1,048,576 0.0032 
24 16,777,216 0.000197 


Table 11.2: ADC resolutions and step sizes at 3.3V VREF 


This table provides a clear view of how the step size decreases with increasing resolution, allowing for 
finer granularity in the digital representation of the analog input. 


This concludes our overview of the analog-to-digital conversion process. In the next section, we will 
examine the ADC peripheral of our STM32 microcontroller. 


The STM32F4 ADC peripheral 


Our STM32F411 microcontroller features a 12-bit ADC capable of measuring signals from up to 19 
multiplexed channels. The ADC can operate in various modes, such as single, continuous, scan, or 
discontinuous, with the results stored in a 16-bit data register. Additionally, the ADC has an analog 
watchdog feature that allows the system to detect when the input voltage exceeds predefined thresholds. 


Before we explain the various ADC modes, let’s understand what we mean by ADC channels. 


The ADC channels 


An ADC channel is a dedicated pathway through which an analog signal is fed into the ADC so 
that it can be converted into a digital value. Each ADC channel corresponds to a specific GPIO pin 
configured to operate in analog mode. 


Sensors, which produce analog signals representing physical phenomena (such as temperature, light, 
or pressure), are interfaced with our microcontroller through these GPIO pins. By configuring a 
GPIO pin as an analog input, the microcontroller can receive the sensor’s analog output signal on the 
corresponding ADC channel. The ADC then converts this continuous analog signal into a discrete 
digital representation that the microcontroller can process, analyze, and use for further decision- 
making tasks in our embedded systems applications. 


You might be wondering, does having 19 channels mean we have 19 separate ADC modules? This is 
where multiplexing comes into play. 


The STM32F4 ADC peripheral 


Multiplexing ADC channels 


Multiplexing allows the ADC to switch between different input signals, sampling each one sequentially. 
This is achieved using an analog multiplexer (MUX) within the ADC peripheral. As we learned 
earlier, each of the ADC channels is connected to a specific GPIO pin configured for analog input. 
The analog MUX selects which analog input signal (from the GPIO pins or internal sources) is 
connected to the ADC’s sampling circuitry at any given time. This selection is controlled by the ADC’s 
configuration registers. 


Figure 11.3 shows the ADC channels’ connection to the analog multiplexor within the ADC 
peripheral block: 


ADC1_INO 
ADC1_IN1 
ADC1_IN2 
ADC1_IN3 
ADC1_IN4 


. MUX ADC 


ADC1_IN16 


ADC1_IN15 


Figure 11.3: ADC channel multiplexing 


Now, lets examine the available ADC modes. 
The ADC modes 


The ADC in our STM32F411 microcontroller can operate in several modes, each tailored to specific 
application requirements. The primary modes of operation include single conversion mode, continuous 
conversion mode, scan mode, discontinuous mode, and injected conversion mode. 


Let’s break them down: 


e Single conversion mode: In this mode, the ADC performs a single conversion of the selected 
channel and then stops. This mode is useful for applications where periodic or event-driven 
sampling of an analog signal is required. To select this mode, we have to set the CONT bit to 
0 in the ADC_CR2 register. 


Example use case: Reading the value from a temperature sensor at specific intervals. 
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e Continuous conversion mode: This mode allows the ADC to repeatedly convert the input 
signal. After each conversion, the next conversion starts automatically. This mode is enabled 
by setting the CONT bit to 1 in the ADC_CR2 register. 


Example Use Case: Continuously monitoring a potentiometer to track its position in real time. 


e Scan mode: This mode is used to convert a sequence of analog channels. We use this mode 
when we need to sample multiple signals in a defined order. This mode is enabled by setting 
the SCAN bit in the ADC_CR1 register and then configuring the sequence of channels in the 
ADC_SORx registers. If the CONT bit is set in ADC_CR2, the sequence restarts after the last 
channel is converted. 


Example use case: Sampling multiple sensor inputs in a data acquisition system. 


¢ Discontinuous mode: This mode allows us to convert a subset of channels within a sequence. 
It reduces the number of channels converted in each sequence, which can be useful for power 
saving or reducing processing load. This mode can be enabled by setting the DISCEN bit in 
the ADC_CR1 register and then defining the number of channels to convert in each group by 
setting the DISCNUM bits in the ADC_CR1 register. 


Example use case: Reducing the sampling rate for a subset of channels in a multi-channel system. 


e Injected conversion mode: This mode is designed for higher-priority conversions that can 
interrupt regular conversions and is useful for applications that require precise timing for 
specific signal measurements. We can use this mode by configuring the injected channels 
and their sequence in the ADC_JSQR register, and then starting the conversion by setting the 
JSWSTART bit in the ADC_CR2 register or via an external trigger. 


Example use case: Prioritizing critical cell voltage measurements in a battery management 
system (BMS) during rapid charging or discharging. 


Before exploring the common ADC registers, let’s understand the two types of channels available in 
the STM32F411 microcontroller. 


Understanding regular channels versus injected channels in 
STM32F411 ADC 


In the STM32F411 microcontroller, the ADC offers a versatile approach to handling multiple analog 
inputs through two main types of channels: regular channels and injected channels. Regular channels are 
configured for routine, sequential conversions, ideal for periodic data acquisition from sensors where 
timing is not extremely critical. These channels follow a predefined sequence set by the ADC_SORx 
registers and can be triggered by software or external events. 


The key ADC registers and flags 


In contrast, injected channels, such as those configured in injected conversion mode, are designed 
for high-priority, time-sensitive tasks, interrupting the regular sequence to perform immediate 
conversions when specific conditions are met. This makes injected channels perfect for capturing 
critical measurements with precise timing, such as motor current sensing in control applications. 
Additionally, the ADC includes an Analog Watchdog feature, which can monitor both regular and 
injected channels for values that exceed predefined thresholds, generating interrupts to handle out-of- 
range conditions. This dual-channel capability, combined with the Analog Watchdog, provides a robust 
framework for diverse applications, from routine environmental monitoring to critical real-time data 
processing and safety monitoring. 


In the next section, we will examine the key registers of the ADC peripheral and some of the flags 


associated with the ADC operations. 


The key ADC registers and flags 


In this section, we will explore the characteristics and functions of some of the crucial registers within 
the ADC peripheral. 


Let's start with ADC Control Register 1 (ADC_CR1). 


ADC Control Register 1 (ADC_CR1) 


This is one of the main control registers that’s used to configure the ADC’s operational settings. It 
provides various configuration options, such as resolution, scan mode, discontinuous mode, and 
interrupt enable. 


The following are the key bits in this register: 


e RES[1:0] (resolution bits): These bits set the resolution of the ADC (12-bit, 10-bit, 8-bit, or 6-bit) 


e SCAN (scan mode): Setting this bit enables scan mode, allowing the ADC to convert multiple 
channels in sequence 


e DISCEN (discontinuous mode): When set, this bit enables discontinuous mode on regular channels 
e AWDEN (Analog Watchdog enable): This bit enables the Analog Watchdog on all regular channels 


e FOCIE (end of conversion interrupt enable): When set, this bit allows an interrupt to be 
generated when the EOC flag is set 


You can find detailed information about this register on page 229 of the STM32F411 reference 
manual (RM0383). 


Next, we have ADC Control Register 2 (ADC_CR2). 
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ADC Control Register 2 (ADC_CR2) 


This is another crucial control register that handles different aspects of ADC operation, including the 
start of conversion, data alignment, and external triggers. 


Here are the key bits in this register: 
« ADON (ADC on): This bit turns the ADC on or off 
e CONT (continuous conversion): Setting this bit enables continuous conversion mode 


e SWSTART (start conversion of regular channels): Setting this bit starts the conversion of 
regular channels 


e ALIGN (data alignment): This bit sets the alignment of the converted data (right or left) 
e EXTEN[I1:0]: This is an external trigger that enables polarity selection for regular channels 
Further information about this register can be found on page 231 of the reference manual. 


Let’s move on to the ADC Regular Sequence Register (ADC_SQRx). 


ADC Regular Sequence Register (ADC_SQRx) 


The ADC_SQRx registers define the sequence in which the ADC converts the channels. There are 
multiple SQR registers to handle the sequence for up to 16 regular channels. 


Here are the key bits in this register: 


e L[3:0]: Regular channel sequence length. These bits set the total number of conversions in the 
regular sequence. 


e SQ1-SQ16: Regular channel sequence. These bits specify the order of the channels to be converted. 
You can read more about this register on page 235 of the reference. The next crucial register is the 


ADC Data Register (ADC_DR) 


ADC Data Register (ADC_DR) 


The ADC_DR register holds the result of the conversion. This is where the digital representation of 
the analog input is stored after the conversion is complete. The register is read-only and the data is 
stored in the lower 16 bits of the register. 


The final register we will examine is the ADC Status Register (ADC_SR). 
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ADC Status Register (ADC_SR) 


This register holds various status flags that indicate the state of the ADC. These flags are essential for 
monitoring the ADC’s operation and handling interrupts. We'll examine these flags in the next section. 


The key ADC flags 


ADC flags are status indicators that inform the system about the state of the ADC operations. These 
flags are essential for monitoring the ADC’s progress, handling interrupts, and managing errors. 


The key ADC flags in the STM32F411 are as follows: 


e End of conversion (EOC) flag: The EOC flag indicates that the ADC has completed a conversion, 
and the result is available in the data register. It is located in the ADC_SR register at bit position 
1 (EOC) and is set by hardware when a regular conversion finishes. 


If the EOCIE bit in the ADC_CR1 register is set, the EOC flag can trigger an interrupt. In this 
case, an interrupt service routine can be triggered to process the converted data. 


e End of injected conversion (JEOC) flag: This flag indicates that an injected conversion sequence 
has been completed and the result is available in the injected data register. It is located in the 
ADC_SR register at bit position 2 (JEOC) and is set by hardware at the end of the conversion 
of all injected channels in the group. Similar to EOC, this flag can generate an interrupt if its 
corresponding interrupt enable bit (JEOCIE) is set in the ADC_CR1 register. 


e Analog Watchdog (AWD) flag: This flag indicates that the converted value has exceeded the 
predefined high or low thresholds set for the analog watchdog. It is located in the ADC_SR 
register at bit position 0 (AWD). It can also generate an interrupt if its corresponding interrupt 
enable bit (AWDIE) is set in the ADC_CR1 register. 


e Overrun (OVR) flag: This flag indicates that a new conversion result has overwritten the 
previous data before it was read. It is located in the ADC_ SR register at bit position 5 (OVR). 
It can also generate an interrupt if its corresponding interrupt enable bit (OVRIE) is set in the 
ADC _CR1 register. 


e Start conversion (STRT) flag: The STRT flag indicates that an ADC conversion has started. 
We can use this flag to verify that the ADC has initiated a conversion process. 


Understanding and effectively using ADC flags is crucial for managing ADC operations in our STM32 
microcontroller. Flags such as EOC, JEOC, AWD, OVR, and STRT provide essential information about 
the status of conversions, data integrity, and threshold monitoring. By leveraging these flags, we can 
enhance the reliability and functionality of our ADC implementations, ensuring accurate and timely 
data acquisition and processing in our embedded systems projects. 


In the next section, we will apply the information we've learned to develop an ADC driver for reading 
analog sensor values. 
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Developing the ADC driver 


In this section, we will apply everything we have learned about the ADC peripheral to develop a driver 
for reading sensor values from a sensor connected to one of the ADC channels. 


Identifying the GPIO pins for the ADC 


Let’s begin by identifying the GPIO pins connected to the ADC channels. To do this, refer to the table 
on page 39 of the STM32F411RE datasheet. This table lists all the GPIO pins of the microcontroller, 
along with their descriptions and additional functionalities. As shown in Figure 11.4, part of this 
table reveals that PA1 has an additional function labeled as ADC1_IN1. This indicates that PA1 is 
connected to ADC1, channel 1: 


Table 8. STM32F411xC/xE pin definitions (continued) 
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USART2_CTS, 
EVENTOUT 
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TIM2_CH2, 
TIM5_CH2, 
11 | 15 | G7 | 24 | m2 |PA1 O |FT| - |SPI4_MOSI/I2S4_SD, | ADC1_1 
USART2_RTS, 
EVENTOUT 


Figure 11.4: Pin definitions 


Let’s configure PA1 so that it functions as an ADC pin. 


First, create a copy of your previous project in your IDE, following the steps outlined in earlier chapters. 
Rename this copied project to ADC. Next, create a new file named adc .c in the Src folder and 
another file named adc. hin the Inc folder. 


Populate your adc . c file with the following code: 


#include "adc.h" 


#define GPIOAEN (1U<<0) 
#define ADC1EN (1U<<8) 
#define ADC CH1 (1U<<0) 


#define ADC SEQ LEN 1 0x00 
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#define CR2 ADCON (1U<<0) 
#define CR2 CONT (1U<<1) 
#define CR2 SWSTART (1U<<30) 
#define SR EOC (1U<<1) 


void pal_adec_ init (void) 
/****Configure the ADC GPIO Pin**/ 
/*Enable clock access to GPIOA*/ 
RCC->AHB1ENR |= GPIOAEN; 


/*Set PA1 mode to analog mode*/ 
GPIOA->MODER |=(1U<<2) ; 
GPIOA->MODER |= (10223) 5 


/****Configure the ADC Module**/ 
/*Enable clock access to the ADC module*/ 
RCC->APB2ENR | =ADC1EN; 


/*Set conversion sequence start*/ 
ADC1->SQR3 = ADC CH1; 


/*Set conversion sequence length*/ 
ADC1->SQR1 = ADC SEQ LEN 1; 


/*Enable ADC module*/ 
ADC1->CR2 |=CR2_ADCON; 


void start _ conversion (void) 


{ 


/*Enable continuous conversion*/ 
ADC1->CR2 |=CR2_CONT; 


/*Start ADC conversion*/ 
ADC1->CR2 |=CR2_SWSTART; 


uint32_t adc_read (void) 


{ 
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} 


/*Wait for conversion to be complete*/ 
while(!(ADC1->SR & SR_EOC)) {} 


/*Read converted value*/ 
return (ADC1->DR); 


Let’s break down the source code, starting with the macro definitions: 


#define GPIOAEN (1U<<0) 
#define ADC1EN (1U<<8) 
#define ADC CH1 (1U<<0) 
#define ADC SEQ LEN 1 0x00 
#define CR2 ADCON (1U<<0) 
#define CR2 CONT (1U<<1) 
#define CR2 SWSTART (1U<<30) 
#define SR_EOC (1U<<1) 


Let’s break down the macros: 


GPIOAEN: This macro enables the clock for GPIOA by setting bit 0 in the AHB1ENR register 
ADC1EN: This enables the clock for ADC1 by setting bit 8 in the APB2ENR register 
ADC_CH1: This selects channel 1 for the ADC conversion in the SQR3 register 
ADC_SEQ_LEN_1: This sets the conversion sequence length to 1 in the SQR1 register 
CR2_ADCON: This enables the ADC module by setting bit 0 in the CR2 register 

CR2_ CONT: This enables continuous conversion mode by setting bit 1 in the CR2 register 
CR2_SWSTART: This starts the ADC conversion by setting bit 30 in the CR2 register 


SR_EOC: This macro waits for the end of conversion by reading bit 1 in the status register (SR) 


Next, we must analyze the configuration sequence of the GPIO pin that’s used for ADC functionality: 


/* Enable clock access to GPIOA */ 


RCC->AHB1ENR |= GPIOAEN; 
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This line enables the clock for GPIOA by setting the appropriate bit in the AHB1ENR register using 
the GPIOAEN macro: 


/* Set PA1 mode to analog mode */ 
GPIOA->MODER le (1U<<2) ; 
GPIOA->MODER le (1U<<3) ; 


These lines configure PA1 as an analog input by setting bits 2 and 3 in the GPIOA_MODER register. 
Let’s move on to the part of the code that configures the ADC parameters: 


/* Enable clock access to the ADC module */ 
RCC->APB2ENR le ADC1EN; 


This line enables the clock for ADC1 by setting the appropriate bit in the APB2ENR register using 
the ADCIEN macro. 


/* Set conversion sequence start */ 
ADC1->SQR3 = ADC CH1; 


This line sets channel 1 as the start of the conversion sequence in the ADC_SQR3 register using the 
ADC_CH1 macro: 


/* Set conversion sequence length */ 
ADC1->SQR1 = ADC SEQ LEN 1; 


This line sets the sequence length to 1 in the ADC_SQR1 register using the ADC_SEQ_LEN_1 macro, 
meaning only one channel will be converted: 


/* Enable ADC module */ 
ADC1->CR2 |= CR2_ADCON; 


This line enables the ADC module by setting the ADCON bit in the ADC_CR2 register. 
Next, we can start the conversion: 


/* Enable continuous conversion */ 
ADC1->CR2 |= CR2_CONT; 


This line enables continuous conversion mode by setting the CONT bit in the ADC_CR2 register 
using the CR2_CONT macro: 


/* Start ADC conversion */ 
ADC1->CR2 |= CR2_SWSTART; 
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This line starts the ADC conversion by setting the SWSTART bit in the ADC_CR2 register using the 
CR2_SWSTART macro. 


Next, we must wait for the results to be ready: 


/* Wait for conversion to be complete */ 
while (!(ADC1->SR & SR_EOC)) {} 


This line waits until the conversion is complete by checking the EOC flag in the ADC_SR register: 


/* Read converted value */ 
return (ADC1->DR) ; 


This line reads the converted digital value from the ADC_DR register. 
In summary, our code performs the following actions: 
1. Initializes the ADC GPIO pin: 
" Enables the clock for GPIOA 
" Sets PA1 to analog mode 
2. Configures the ADC module: 
" Enables the clock for ADC1 
* Sets channel 1 as the start of the conversion sequence 
* Sets the conversion sequence’s length to 1 
" Enables the ADC module 
3. Starts the ADC conversion process: 
" Enables continuous conversion mode 
" Starts the ADC conversion process 
4. Reads the ADC value: 
* Waits for the conversion to complete 
" Reads the converted value from the ADC data register 
Our next task is to populate the adc . h file. Here’s the code: 


#ifndef ADC H _ 
#define ADC H _ 
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#include <stdint.h> 
#include "stm32f4xx.h" 


void pal_ade init (void); 
void start conversion (void); 
uint32 t ade _read(void) ; 


#endif£ 


Here, we are simply exposing the functions implemented in adc . c, making them callable from 
other files. 


Let’s move on to the main.. c file. Update your main . c file, like so: 


#include <stdio.h> 
#include "adc.h" 
#include "uart.h" 


int sensor value; 


int main (void) 


{ 


/*Initialize debug UART*/ 
kewaie,_aliaialic (()) F 


/*Initialize ADC*/ 
pal_adec_init(); 


/*Start conversion*/ 
start _conversion() ; 


while (1) 
{ 


sensor value = adc_read(); 


printf ("Sensor Value: %d\r\n",sensor value) ; 
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Let’s break it down: 


¢ Including header files: 


#include <stdio.h> 
finelude "ade hw" 


#include "uart.h" 


Let’s take a closer look: 


* #include <stdio.h>: This includes the standard input/output library, which provides 
the printf () function for printing the sensor values 


" #include "adc.h": This includes the header file for the ADC functions, ensuring 
that the pal_adc_init, start_conversion, and adc_read functions from our 
adc.c file are available 


* #include "uart.h": This includes the header file for the UART functions we developed 
in the previous chapter, ensuring that the uart_init function is available 


e Global variable declaration: 
int sensor value; 
This declares a global variable to store the ADC value that’s read from the sensor. 
e Main function: 


/* Initialize debug UART */ 
uart_init(); 
This line initializes the UART peripheral, allowing us to print the sensor value: 
/* Initialize ADC */ 
joel evoke) alialalic (()) 7 
This line initializes the ADC: 
/* Start conversion */ 


start _conversion() ; 


This line starts the ADC conversion process. 
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¢ Infinite loop: 


sensor value = adc_read()j; 


This line reads the latest converted value from the ADC and stores it in the sensor __ 
value variable: 


printf ("Sensor Value: %d\r\n", sensor value) ; 


This line prints the sensor value to the terminal or console using the UART. The \r\n part at 
the end of the string ensures that the printed value starts on a new line each time. 


We are now ready to test the project. 
Testing the project 


To test your project, you must connect your sensor or a potentiometer to the development board. 
Follow these steps: 


1. Connect a sensor: 


" Signal pin: Connect the signal pin of your sensor to PAI. 
* GND pin: Connect the GND pin of the sensor to one of the GND pins on the development board. 


* VCC pin: Connect the VCC pin to either the 3.3V or 5V pin on the development board. 
Ensure you verify the required voltage from your sensor’s documentation as different sensors 
may need either 3.3V or 5V. 


2. Use a potentiometer: 


* Ifa sensor is not available, you can use a potentiometer instead. A potentiometer is an 
adjustable resistor that’s used to vary the voltage. It has three terminals: two fixed and one 
variable (wiper). 


* Middle terminal: Connect the middle terminal (wiper) of the potentiometer to PAI. 
" Left terminal: Connect the left terminal to 3.3V. 


" Right terminal: Connect the right terminal to GND. 
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See Figure 11.4 for the connection diagram: 
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Figure 11.5: Potentiometer connection 


As you turn the knob of the potentiometer, the resistance between the middle terminal and 
the fixed terminals (3.3V and GND) will change, which, in turn, changes the voltage output at 
the middle terminal. This varying voltage will be measured by the ADC. 


3. Run the project: 


* Build and run the project on the development board. 


* Open RealTerm or another serial terminal program and select the appropriate port and 
baud rate. 


* You should see the sensor values being printed in real time on the terminal. As you turn the 
potentiometer knob, the displayed value should change, reflecting the varying output voltage. 


Summary 


Summary 


In this chapter, we explored the ADC, a vital peripheral in embedded systems that enables microcontrollers 
to interface with the analog world. We started with an overview of the analog-to-digital conversion 
process, highlighting its importance and discussing key specifications such as resolution, step size, 
and VREF 


Then, we delved into the STM32F411 microcontrollers ADC peripheral, examining its capabilities and 
the relevant registers required for ADC operations. This included an overview of key ADC registers, 
such as ADC_CR1, ADC_CR2, ADC_SQRx, ADC_SR, and ADC_DR, as well as important ADC flags, 
such as EOC, JEOC, AWD, OVR, and STRT. 


This chapter also explained the different ADC modes, including single conversion mode, continuous 
conversion mode, scan mode, discontinuous mode, and injected conversion mode. Each mode was 
explained with practical use cases to illustrate their applications. 


Next, we examined how multiplexing allows the ADC to switch between multiple input signals, 
enabling the microcontroller to handle multiple analog inputs efficiently. 


Finally, we applied the theoretical concepts by developing a bare-metal ADC driver. This involved 
configuring a GPIO pin for ADC input, configuring the ADC module, starting conversions, and 
reading the ADC values. 


In the next chapter, we will focus on the Serial Peripheral Interface (SPI), another commonly used 
communication protocol known for its speed and efficiency in embedded systems. 
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In this chapter, we will learn about the Serial Peripheral Interface (SPI) protocol, another important 
communication protocol widely used in embedded systems. 


We will start by delving into the basics of the SPI protocol, understanding its master-slave architecture, 
data transfer modes, and typical use cases. Next, we will examine the key registers of the SPI peripheral 
in STM32 microcontrollers, providing detailed insights into their configuration and usage. Finally, we 
will apply this knowledge to develop a bare-metal SPI driver, demonstrating practical implementation 
and testing. 


In this chapter, we will cover the following main topics: 


e Overview of the SPI protocol 
e The STM32F4 SPI peripherals 
e Developing the SPI driver 


By the end of this chapter, you will have a good understanding of the SPI protocol and be equipped 
to develop bare-metal drivers for SPI. 


Technical requirements 


All code examples for this chapter can be found on GitHub at the following link: 


https://github.com/Packt Publishing/Bare-Metal -Embedded-C- Programming 


Overview of the SPI protocol 


Let’s dive into what SPI is, its key features, how it works, and some of the nuances that make it so powerful. 
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What is SPI? 


SPI is a synchronous serial communication protocol developed by Motorola. Unlike Universal 
Asynchronous Receiver-Transmitter (UART), which is asynchronous, SPI relies on a clock signal to 
synchronize data transfer between devices. It’s designed for short-distance communication (usually 
no more than 30 cm), primarily between a microcontroller and peripheral devices such as sensors, 
SD cards, and display modules. Let’s see its key features. 


Key features of SPI 


SPI stands out due to its efficiency. Here are some of its key features: 


Full-duplex communication: SPI supports simultaneous data transmission and reception 


High speed: SPI can operate at much higher speeds compared to protocols such as Inter- 
Integrated Circuit (I2C) and UART 


Master-slave architecture: One master device controls communication, while one or more 
slave devices respond 


Flexible data length: Can handle various data lengths, commonly 8 bits, but not limited to that 


To be able to connect two SPI devices, we must understand the SPI interface. 


The SPI interface 


SPI uses four primary lines for communication, each with several alternative names you might encounter: 


Master In Slave Out (MISO): Also known as Serial Data Out (SDO) or Data Out (DOUT), 
this line carries data from the slave device to the master 


Master Out Slave In (MOSI): Also known as Serial Data In (SDI) or Data In (DIN), this line 
carries data from the master device to the slave 


Serial Clock (SCK): Also referred to as SCLK (or simply CLK), this is the clock signal generated 
by the master to synchronize data transfer 


Slave Select (SS): Also known as Chip Select (CS) or Not Slave Select (NSS), this line is used 
by the master to select which slave device to communicate with 


When multiple slaves are used, each slave typically has its own SS line, allowing the master to control 
communication with each slave individually. Figure 12.1 illustrates the SPI connection between a 
single master and a single slave: 
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SPI ES EA SPI 


MASTER SLAVE 


Figure 12.1: The SPI interface 


Figure 12.2 depicts the SPI setup with a single master controlling multiple slaves: 


SBI SPI 
MASTER SLAVE 1 
SPI 
SLAVE 2 


Figure 12.2: The SPI interface - multiple slaves 
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When multiple slaves are connected to a single SPI bus, managing the MISO line is crucial to avoid 
communication errors. Since all slaves share this line, non-selected slaves could interfere with the 
signal from the selected slave if not properly controlled. To prevent such issues, several techniques 
are used. One common method is tri-state buffering, where each slave's MISO line enters a high- 
impedance (high-Z) state when its CS line is inactive, effectively disconnecting it from the bus. This 
ensures only the selected slave drives the MISO line, preventing bus contention. Another approach 
is the open-drain configuration with a pull-up resistor, where the MISO line is left floating (high- 
Z) when transmitting a 1 and pulled low by the selected slave when transmitting a 0. This reduces 
contention risks but may result in slower communication speeds due to the time delay introduced 
by the pull-up resistor. 


Let’s see how the SPI protocol works. 


How SPI works 


SPI works on a simple principle: the master generates a clock signal and selects a slave to communicate 
with by pulling the corresponding SS line low. Data is then exchanged simultaneously between the 
master and the slave over the MOSI and MISO lines. 


Here's a step-by-step breakdown: 


1. Initialization: The master sets the clock frequency and data format (for example, 8-bit data). 


2. Slave selection: The master pulls the SS line of the target slave low. In a multi-slave configuration, 
where each slave has its own CS line, the master first sets all CS lines high (inactive) before 
sending any initialization messages. This ensures that uninitialized slaves don't mistakenly respond 
to commands not intended for them. Once all CS lines are confirmed high, the master then 
activates the CS line of the desired slave by pulling it low to begin controlled communication. 


3. Data transmission: The master sends data to the slave on the MOSI line, while the slave sends 
data to the master on the MISO line. 


4. Clock synchronization: The master controls the clock, ensuring data is sampled and shifted 
at the correct times. 


5. Completion: Once the data transfer is complete, the master pulls the SS line high, deselecting 


the slave. 


To successfully implement an SPI driver, it’s essential to understand key SPI configuration parameters. 
Let’s explore them one by one, starting with Clock Phase (CPHA) and Clock Polarity (CPOL). 
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CPHA and CPOL 


In SPI communication, the settings for CPHA and CPOL determine the timing and characteristics of 
the clock signal used to synchronize data transfer between the master and slave devices. These settings 
are crucial for ensuring that data is correctly sampled and interpreted by both the master and the 
slave. Here’s a detailed look at how CPHA and CPOL affect SPI communication. 


CPOL 


CPOL determines the idle state of the clock signal (SCK). It controls whether the clock signal is high 
or low when no data is being transferred: 


e CPOL = 0: The clock signal is low (0) when idle. This means that the clock line remains low 
between data transmissions. 


e CPOL=1: The clock signal is high (1) when idle. This means that the clock line remains high 
between data transmissions. 


CPHA 


CPHA determines when data is sampled and when it is shifted out. It controls the edge of the clock 
signal on which data is read and written: 


e CPHA = 0: Data is sampled on the leading edge (first edge) of the clock pulse and shifted out 
on the trailing edge (second edge) 


e CPHA = 1: Data is shifted out on the leading edge (first edge) of the clock pulse and sampled 
on the trailing edge (second edge) 


The combination of CPOL and CPHA results in four different SPI modes, each affecting the timing 
of data sampling and shifting. 


Selecting the appropriate SPI mode is crucial for ensuring proper communication between the master 
and slave devices. Both devices must be configured to use the same CPOL and CPHA settings to 
correctly interpret the data being exchanged. The choice of mode depends on the specific requirements 
of the devices and the timing constraints of the application. Let’s move on to SPI data modes. 


Data modes 


SPI is flexible with the data length it can handle. While 8-bit data transfers are common, SPI can 
be configured to handle different data lengths, such as 16-bit or 32-bit transfers, depending on 
the application. The master and slave devices need to agree on the data length to ensure accurate 
communication. The last configuration parameter is the SPI speed. 
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SPI speed 


One of SPI’s significant advantages is its speed. SPI can operate at very high frequencies, typically up 
to several tens of MHz, depending on the hardware capabilities of the master and slave devices. The 
actual speed used in an application depends on several factors: 


e 


Device capabilities: The maximum speed supported by both the master and the slave 


Signal integrity: Higher speeds can lead to signal integrity issues such as crosstalk and reflections, 
especially over longer distances 


Power consumption: Higher speeds consume more power, which might be a consideration 
in battery-powered applications 


This concludes our overview of the SPI protocol. In the next section, we will analyze the SPI peripheral 
in the STM32F4 microcontroller. 


The STM32F4 SPI peripherals 


As with other peripherals, STM32 microcontrollers often include several SPI peripherals; the number 
varies depending on the specific model. The STM32F411 microcontroller has five SPI peripherals, 
namely the following: 


SPI1 
SPI2 
SPI3 
SPI4 
SPI5 


Key features 


Here are some of the key features: 


Full-duplex and half-duplex communication: Supports simultaneous two-way communication 
(full-duplex) or one-way communication (half-duplex) 


Master/slave configuration: Each SPI peripheral can be configured as either a master or a 
slave device 


Flexible data size: Supports data sizes ranging from 4 to 16 bits 


High-speed communication: Capable of operating at speeds up to 42 MHz in master mode 
and up to 21 MHz in slave mode 


The STM32F4 SPI peripherals 


Direct Memory Access (DMA) support: DMA support for efficient data transfer without 
CPU intervention 


Negative SS (NSS) pin management: Hardware management of the NSS pin for 
multi-slave configurations 


Cyclic Redundancy Check (CRC) calculation: Built-in hardware CRC calculation for data 
integrity verification 


Bidirectional mode: Supports bidirectional data mode, allowing a single data line to be used 
for both sending and receiving data 


Let’s examine the key registers of this peripheral. 


Key SPI registers 


To get SPI up and running on the STM32F411 microcontroller, we need to configure several registers 
that control various aspects of the SPI peripheral. Let’s break down the main registers we'll be working 
with, starting with the Control Register 1 register. 


SPI Control Register 1 (SPI_CR1) 


The SPI_CR1 register is central to configuring the SPI peripheral. It includes settings that define 
the SPI mode, data format, clock settings, and more. Key bits in this register include the following: 


CPHA: This bit determines on which clock edge the data is sampled. Setting CPHA to 0 means 
data is sampled on the first edge (leading edge), while setting it to 1 means data is sampled on 
the second edge (trailing edge). 


CPOL: This bit sets the idle state of the clock line. Setting CPOL to 0 means the clock is low 
when idle, and setting it to 1 means the clock is high when idle. 


Master Selection (MSTR): This bit configures the SPI peripheral as either a master or a slave. 
Setting MSTR to 1 makes the SPI peripheral a master, while 0 sets it as a slave. 


Baud Rate Control (BR[2:0]): These bits configure the baud rate for SPI communication. 


SPI Enable (SPE): This bit enables the SPI peripheral. We need to set SPE to 1 to activate 
SPI communication. 


Frame Format (LSBFIRST): This bit determines the bit order in data transmission. Setting 
LSBFIRST to 0 transmits the most significant bit (MSB) first, while 1 transmits the least 
significant bit (LSB) first. 


SS Internal (SSI): This bit is used in master mode to internally control the SS line. 


Software Slave Management (SSM): Setting this bit to 1 enables software management of the 
SS line, allowing the master to control it manually. 
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Next, we have the SPI Status Register. 
SPI Status Register (SPI_SR) 


The SPI_SR register provides real-time status updates on the SPI peripheral, informing us about 
various operational states and flags. Key bits in this register include the following: 


e Receive Buffer Not Empty (RXNE): This flag indicates that the receive buffer contains unread data 


¢ Transmit Buffer Empty (TXE): This flag signals that the transmit buffer is empty and ready 
for new data 


e CRC Error Flag (CRCERR): This flag is set when a CRC error is detected, indicating possible 
data corruption 


e Mode Fault (MODF): This flag signals a mode fault, often due to incorrect master/slave configuration 


e Overrun Flag (OVR): This flag indicates an overrun condition, where the receive buffer wasn't 
read in time 


e Busy Flag (BSY): This flag indicates that the SPI peripheral is currently engaged in a transmission 
or reception 


The last key register is the Data Register. 
SPI Data Register (SPI_DR) 


The SPI_DR register is the conduit for data transmission and reception. It’s where we write data to 
be sent out and read data that’s been received: 


e Transmitting data: When we write to the SPI_DR register, data is sent out over the MOST line 


e Receiving data: When we read from the SPI_DR, you get the data that was received on the 
MISO line 


With these registers in mind, were now ready to develop the SPI driver. Lets jump into that in the 
next section. 


Developing the SPI driver 


Create a copy of your previous project in your IDE and rename this copied project to SPI. Next, 
create a new file named spi .c in the Src folder and another file named spi .h in the Inc folder. 
Populate your spi . c file with the following code: 


#include "spi.h" 

#define SPI1EN (Alicea) 
#define GPIOAEN (1U<<0) 
#define SR TXE (1U<<1) 
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#define SR_RXNE 
#define SR_BSY 
void spi gpio init (void) 


{ 


/*Enable clock access to GPIOA*/ 


(1U<<0) 
(1U<<7) 


RCC- >AHB1ENR le GPIOAEN ; 


/*Set PA5,PA6,PA7 mode to alternate function*/ 


/*PAS*/ 


GPIOA- >MODER 
GPIOA- >MODER 


/*PA6* / 


GPIOA- >MODER 


GPIOA- > 


/*PAT*/ 
GPIOA- > 
GPIOA-> 


MODE 


MODE 
MODE 


R 


R 
R 


/*Set PA9 as 
GPIOA->MODER 
GPIOA->MODER 


&=~ (1U<<10); 
l= (Iuesii) y; 
&=~ (1U<<12); 
|= Guel) 5 
&=~ (1U<<14) ; 
EGUE 


output pin*/ 
|=(1U<<18) ; 
&=~ (1U<<19); 


/*Set PAS5,PA6,PA7 alternate function type to SPI1*/ 


/*PAS*/ 


GPIOA->AFR[ 
GPIOA->AFR[ 
GPIOA->AFR[ 
GPIOA->AFR[ 


/*PA6* / 


GPIOA->AFR[ 
GPIOA- >AFR[ 
GPIOA->AFR[ 
GPIOA->AFR[ 


/*PAT*/ 


GPIOA->AFR[ 
GPIOA->AFR[ 


0] 
0 
0 
0 


|=(1U<<20) ; 
&= ~(1U<<21); 
|=(1U<<22) ; 
&= ~(1U<<23); 
|=(1U<<24) ; 
&= ~(1U<<25); 
|=(1U<<26) ; 
&= ~(1U<<27) ; 
|=(1U<<28) ; 
&= ~(1U<<29); 
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GPIOA->AFR[0] |=(1U<<30) ; 
GPIOA->AFR[0] &= ~(1U<<31) ; 


} 


Next, we have the function for configuring the SPI parameters: 


void spil_ config (void) 
/*Enable clock access to SPI1 module*/ 
RCC- >APB2ENR |= SPI1EN; 


/*Set clock to £PCLK/4*/ 
SPI1->CR1 |=(1U<<3) ; 
SPI1->CR1 &=~(1U<<4) ; 
SPI1->CR1 &=~(1U<<5); 


/*Set CPOL to 1 and CPHA to 1*/ 
SPI1->CR1 |=(1U<<0) ; 
SPI1->CR1 |=(1U<<1) ; 


/*Enable full duplex*/ 
SPI1->CR1 &=~(1U<<10); 


/*Set MSB first*/ 
SPI1->CR1 &= ~(1U<<7); 


/*Set mode to MASTER*/ 
SPI1->CR1 |= (1U<<2); 


/*Set 8 bit data mode*/ 
SPI1->CR1 &= ~(1U<<11); 


/*Select software slave management by 
* setting SSM=1 and SSI=1*/ 

SPI1->CR1 |= (1<<8); 

SPI1->CR1 |= (1<<9); 


/*Enable SPI module*/ 
SPI1->CR1 |= (1<<6); 
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void spil transmit (uint8 t *data,uint32_t size) 


{ 


} 


uint 
uint 
whil 


{ 


32 © da0; 
SAG TEMIR 


e(i<size) 


/*Wait until TXE is set*/ 


while(!(SPI1->SR & (SR_TXE))){} 


} 
/*Wa 
whil 


/*Wa 
whil 


VEe 
temp 
temp 


/*Write the data to the data register*/ 


SPI1->DR = data[il; 


i++; 


it until TXE is set*/ 
e(!(SPI1->SR & (SR_TXE))) {} 


it for BUSY flag to reset*/ 
e((SPI1->SR & (SR_BSY))) {} 
ear OVR flag*/ 

ZRS PTIT >DE, 

ZRS PTIT ZSR; 


Here is the function for receiving data: 


void spil receive(uint8 t *data,uint32_t size) 


{ 


whil 


{ 


e(size) 


/*Send dummy data*/ 

SPlil=sDR =0); 

/*Wait for RXNE flag to be set*/ 
while(!(SPI1->SR & (SR_RXNE))) {} 
/*Read data from data register*/ 
*data++ = (SPI1->DR); 


size--; 
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Finally, we have the functions for controlling the CS pin: 


void cs_enable (void) 


{ 
} 


GPIOA->ODR &=~(1U<<9) ; 


And then, the function for deselecting the slave: 


/*Pull high to disable*/ 
void cs disable (void) 


{ 
} 


GPIOA->ODR |=(1U<<9); 


Let’s walk through each part of the SPI initialization and communication code. We'll start by looking 
at the defined macros and then dive into each function. 


Defined macros 


Let’s break down the meaning of the macros and their functions: 


#define SPI1EN (1U<<12) 
#define GPIOAEN (1U<<0) 
#define SR_TXE (aha) 
#define SR_RXNE (1U<<0) 
#define SR_BSY (1U<<7) 


Over here, we see the following: 


e SPI1EN: This is defined as (1U<<12), which sets bit 12. It’s used to enable the clock for the 
SPI1 peripheral. 


e GPIOAEN: This is defined as (1U<<0), which sets bit 0. This enables the clock for GPIOA. 
e SR_TXE: This is defined as (1U<<1). This indicates that the transmit buffer is empty. 
e SR_RXNE: This is defined as (1U<<0). This indicates that the receive buffer is not empty. 


e SR_BSY: This is defined as (1U<<7). This indicates that the SPI interface is busy with a transfer. 


Let’s break down the initialization function. 
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GPIO initialization for SPI 
Let’s analyze the configuration of the SPI1 GPIO pins: 


RCC->AHB1ENR |= GPIOAEN; 


This line enables the clock for GPIOA by setting the appropriate bit in the AHB1 peripheral clock 
enable register: 


/*PAS*/ 

GPIOA->MODER &=~ (1U<<10) ; 
GPIOA->MODER |=(1U<<11) ; 
/*PA6*/ 

GPIOA->MODER &=~ (1U<<12) ; 
GPIOA- >MODER |S(tieeil3})) 3 
/*PAT*/ 

GPIOA->MODER &=~ (1U<<14) ; 
GPIOA- >MODER |S(Clttiget)) 5 


These lines configure PA5, PA6, and PA7 pins to alternate function modes, necessary for SPI: 


GPIOA- >MODER le (us< 18) 
GPIOA->MODER &= ~(1U<<19); 


This configures PA9 as a general-purpose output pin, which will be used for SS: 


/*PA5*/ 

GPIOA->AFR[0] |=(1U<<20); 
GPIOA->AFR[0] &= ~(1U<<21); 
GPIOA->AFR[0] |=(1U<<22); 
GPIOA->AFR[0] &= ~(1U<<23); 
/*PA6*/ 

GPIOA->AFR[0] |=(1U<<24); 
GPIOA->AFR[0] &= ~(1U<<25); 
GPIOA->AFR[0] |=(1U<<26); 
GPIOA->AFR[0] &= ~(1U<<27); 
/*PA7*/ 

GPIOA->AFR[0] |=(1U<<28); 
GPIOA->AFR[0] &= ~(1U<<29); 
GPIOA->AFR[0] |=(1U<<30); 
GPIOA->AFR[0] &= ~(1U<<31); 
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These lines set the alternate function registers to configure PA5, PA6, and PA7 for SPT1. 


SPI1 configuration 
Next, we have the code for configuring the SPI parameters: 


RCC- >APB2ENR |= SPI1EN; 


This line enables the clock for SPI1 by setting the appropriate bit in the APB2 peripheral clock 
enable register: 


SPI1->CR1 |=(1U<<3) ; 
SPI1->CR1 &=~(1U<<4) ; 
SPI1->CR1 &=~(1U<<5) 


1 


These lines configure the SPI clock prescaler to set the baud rate by dividing the APB2 peripheral 
clock by 4, as SPI1 is connected to the APB2 bus. The baud rate is determined by the Baud Rate (BR) 
bits in the SPI Control Register (SPI1->CR1). In this case, setting the BR bits (bit 5 to bit 3) to 001 
results in the peripheral clock being divided by 4, which dictates the speed at which data is transferred 
over the SPI bus: 


SPI1->CR1 |=(1U<<0); 
SPI1->CR1 |=(1U<<1) ; 


These lines set the clock polarity and phase to ensure correct data sampling: 
SPI1->CR1 &=~(1U<<10) ; 
This line ensures that full-duplex mode is enabled for simultaneous transmit and receive: 
SPIEL T ERI (alka 7/)) p 
This line configures SPI to transmit the MSB first: 
SPI1->CR1 |= (1U<<2); 
This line sets SPI1 to master mode, making it the controller of the SPI bus: 
SWISS, fa (al Ukere Lab) F 
This line configures the SPI data frame size to 8 bits: 


SPI1->CR1 |= (1<<8); 
SPI1->CR1 |= (1<<9); 


Developing the SPI driver 


These lines enable software management of the SS line. 


SPI1->CR1 |= (1<<6); 


This line enables the SPI peripheral for operation: 


Let’s move on to the spil_transmit () function. 


Transmitting data with SPI 

This snippet deals with transmitting the data: 
while (!(SPI1->SR & (SR_TXE))) {} 

This loop waits until the transmit buffer is empty before sending the next byte: 
SPI1->DR = data[il]; 

This line sends the current byte of data: 
while ((SPI1->SR & (SR_BSY))) {} 


This ensures the SPI bus is not busy before continuing: 


temp 


SPI1->DR; 
SPI1->SR; 


temp 


These two lines play a crucial role in managing the SPI communication process. After the master 
transmits data through the SPI Data Register, the same register captures the data received from the 
slave. To ensure incoming data is properly processed, we read the Data Register, even if we don't need 
the value. This read operation automatically clears the OVR flag. It’s also advisable to read the Status 
Register as part of this process. 


Next, we have the spil_ receive () function. 
SPI data reception 


This deals with receiving the data: 


SMS = Op 
This line sends dummy data to generate clock pulses: 
while (!(SPI1->SR & (SR_RXNE))) {} 
This line waits until data is received: 


*data++ = (SPI1->DR) ; 
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This line reads the received data: 


The last functions are the cs_enable() andcs_ disable () functions. 


CS management 
This line pulls the SS line low to enable the slave device: 


GPIOA->ODR &= ~(1U << 9); 


This line pulls the SS line high to disable the slave device: 


GPIOA->ODR |= (1U << 9); 


Our next task is to populate the spi . h file. 


The header file 


Here is the code: 


#ifndef SPI H_ 
#define SPI H_ 
#include "stm32f4xx.h" 
#include <stdint.h> 


void spi _ gpio init (void); 

void spil_ config(void) ; 

void spil transmit (uint8 t *data,uint32 t size); 
void spil receive(uint8 t *data,uint32 t size); 
void cs_enable (void); 

void cs_disable (void); 


#endif 


Over here, we are simply exposing the functions to make them accessible in other files. 
To effectively test the SPI driver, we need a suitable slave device. In the next section, we'll dive into the 
ADXL345 accelerometer, which we'll use as our slave device to test the SPI driver. 


Getting to know the ADXL345 accelerometer 


ADXL345 is a gem in the world of digital accelerometers, and it’s perfect for testing our SPI module. 
Let’s dive into what makes this device so special and how it fits into our embedded system projects. 
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What is ADXL345? 


ADXL345 is a small, thin, ultralow power, 3-axis accelerometer that can measure static acceleration 
of gravity in tilt-sensing applications, as well as dynamic acceleration resulting from motion or 
shock. It’s versatile, highly accurate, and can handle a variety of tasks with ease. This accelerometer 
offers high-resolution (up to 13-bit) measurements with a selectable measurement range of +2 g, +4 
g +8 g, or +16 g: 


Figure 12.3: The ADXL345 


Let’s analyze its key features. 


Key features of the ADXL345 


Following is a list of the ADXL345’s features: 


Ultralow power: The device consumes as little as 23 uA in measurement mode and just 0.1 yA 
in standby mode, making it ideal for battery-powered applications. 


User-selectable resolution: We can choose a resolution from 10 to 13 bits, providing a scale 
factor of 4 mg/LSB across all g ranges. 


Flexible interface: The ADXL345 supports both SPI (3- and 4-wire) and I2C digital interfaces, 
giving us flexibility in how you integrate it into your system. 


Special sensing functions: It includes single tap, double tap, and free-fall detection, along with 
activity/inactivity monitoring. These functions can be individually mapped to two interrupt 
output pins, making it highly responsive to physical events. 


Wide supply voltage range: It operates from 2.0 V to 3.6 V, accommodating various 
power configurations. 


Robust performance: The ADXL345 can withstand a shock of up to 10,000 g, ensuring durability 
in rugged applications. 


Let’s see some of its common applications. 
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Applications 
Given its robust feature set, the ADXL345 is well suited for a range of applications: 
e Industrial equipment: For machinery monitoring and fault detection 
e Aerospace equipment: In systems where reliability and precision are paramount 


e Consumer electronics: Examples are smartphones, gaming devices, and wearable technology 


e Health and sports: For tracking motion and activity in health monitoring devices 
Let’s take a closer look at its sensing function. 
Sensing function 


At its core, the ADXL345 measures acceleration along three axes: x, y, and z. The data is available in 
a 16-bit two's complement format and can be accessed via either the SPI or I2C interface. 


The following are its sensing functions: 


e Activity and inactivity monitoring: The accelerometer can detect movement or the absence 
thereof, making it great for sleep monitoring and fitness applications 


e Tap detection: It can recognize single and double taps in any direction, which is useful for 
gesture-based controls 


e Free-fall detection: The device can detect if it’s in free fall, which can be used in safety systems 
to trigger an alert or a response 


Figure 12.4 shows the x, y, and z axes: 


Figure 12.4: The x, y, and z axes 
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The ADXL345 also offers various low-power modes to help manage power consumption intelligently. 
These modes allow the device to enter sleep or standby states based on our defined thresholds and 
activity levels. 


It also includes a 32-level FIFO buffer, which helps in storing data temporarily to reduce the load on 
the host processor. This buffer is especially useful in applications requiring high data throughput or 
when the processor is busy with other tasks. Lastly, its pinout is straightforward: 


e VDD I/O: Digital interface supply voltage 

e GND: Ground 

e CS: CS for SPI communication 

e INTI and INT2: Interrupt output pins 

e SDA/SDI/SDIO: Serial data line for 12C or SPI input 
e SCL/SCLK: Serial clock line for 12C or SPI 


Before we dive into developing the driver for this slave device, lets first explore some key concepts of 
acceleration measurement. 


Understanding key concepts - static acceleration of gravity, tilt- 
sensing, and dynamic acceleration 


When working with accelerometers such as the ADXL345, it’s important to grasp some fundamental 
concepts that underpin their operation and applications. Let’s break down what static acceleration of 
gravity, tilt-sensing, and dynamic acceleration mean. 


Static acceleration of gravity 


Static acceleration of gravity refers to the constant acceleration due to gravity that acts on an object at 
rest. This acceleration is always present and has a magnitude of approximately 9.8 meters per second 
squared (m/s?) on the surface of the Earth. 


In the context of an accelerometer such as the ADXL345, static acceleration is used to determine 
the orientation of the device. When the accelerometer is at rest and positioned flat, it measures the 
static acceleration of gravity along the z axis, which helps to identify which direction is “down.” This 
capability is crucial for applications such as the following: 


e Orientation detection: Determining the device's orientation relative to the Earth’s surface 


e Tilt-sensing: Measuring the tilt angle of the device by observing how gravity’s force changes 
across different axes 


The next important concept is tilt-sensing. 
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Tilt-sensing 


Tilt-sensing is the process of measuring the angle at which an object is tilted with respect to the 
force of gravity. This is achieved by analyzing the static acceleration readings from the accelerometer. 


Imagine holding a tablet. When you tilt it forward, backward, or sideways, the accelerometer inside 
detects changes in the static acceleration along its x, y, and z axes. By comparing these changes, the 
device can calculate the tilt angle. Heres how it works: 


e X-axis tilt: If the device is tilted along the x axis, the static acceleration detected on the x axis 
will increase or decrease depending on the direction of the tilt. 


e Y-axis tilt: Similarly, tilting along the y axis will cause variations in the static acceleration 
readings on the y axis. 


e Z-axis stability: The z axis usually detects the full force of gravity when the device is lying flat. 
Changes in tilt cause redistributions of this force among the x and y axes. 


Tilt-sensing is widely used in applications such as the following: 


e Screen orientation: Automatically adjusting the display from portrait to landscape mode 
e Gaming controllers: Detecting movements and tilts to enhance gameplay 


¢ Industrial equipment: Monitoring the tilt of machinery or vehicles for stability and safety 
The final key concept is dynamic acceleration. 
Dynamic acceleration 


Dynamic acceleration refers to the acceleration that results from motion or external forces acting on 
the device. Unlike static acceleration, which is constant, dynamic acceleration varies based on how 
the device is moving. 


For instance, if you shake or move the accelerometer, it measures these changes as dynamic acceleration. 
This type of acceleration is crucial for the following: 


e Motion detection: Identifying when the device is moved, which can be used in fitness trackers 
to count steps 


e Shock or impact sensing: Detecting sudden impacts or vibrations, useful in crash detection 
systems or drop tests 


e Vibration monitoring: Measuring vibrations in industrial machinery to predict failures or 
maintenance needs 


«> 


Before wrapping up this section, let’s clarify one more concept we introduced earlier: “g: 
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When dealing with accelerometers such as the ADXL345, you often come across terms such as +2 
g +4 g, +8 g, or +16 g. These terms are crucial for understanding the measurement capabilities and 
limits of the device. Let’s break down what g means and how these ranges affect the performance and 
application of an accelerometer. 


What is g? 


The term g refers to the acceleration due to gravity at the Earth's surface, which is approximately 9.8 
meters per second squared (m/s’). It is used as a unit of measurement for acceleration. When we 
say an accelerometer can measure +2 g, it means it can detect accelerations up to twice the force of 
gravity in either direction along an axis. 


With this clarified, we are now ready to develop the driver for the ADXL345 device. 


Developing the ADXL345 driver 


Create a new file named adx1345.cin the Src folder and another file named adx1345 .h in the 
Inc folder. 


The header file 
Populate the adx1345 . h file with this: 


#ifndef ADXL345_ H_ 
#define ADXL345_ H_ 
#include "spi.h" 


#include <stdint.h> 


#define ADXL345 REG DEVID (0x00) 
#define ADXL345 REG DATA FORMAT (0x31) 
#define ADXL345 REG POWER_CTL (0x2D) 
#define ADXL345 REG DATA START (0x32) 
#define ADXL345 RANGE 4G (0x01) 
#define ADXL345 RESET (0x00) 
#define ADXL345 MEASURE BIT (0x08) 
#define ADXL345 MULTI BYTE ENABLE (0x40) 
#define ADXL345 READ OPERATION (0x80) 


void adxl_ init (void); 
Wepkel evcbiell retsercl(ipaligiet) ie tevclehasisisi, ‘iialinieg) ic? sexgkieet) F 
#endif 
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The adx1345 .h file begins by including our SPI driver with #include "spi.h" and proceeds 
to define the necessary macros. Let’s break down the macros: 


ADXL345 REG DEVID (0x00): This macro defines the register address for the device ID 
of the ADXL345 


ADXL345 REG DATA FORMAT (0x31): This macro defines the register address for setting 
the data format of the ADXL345 


ADXL345 REG POWER CTL (0x2D): This macro defines the register address for the 
power control settings of the ADXL345 


ADXL345 REG DATA START (0x32): This macro defines the starting register address 
for reading acceleration data from the ADXL345 


ADXL345 RANGE 4G (0x01): This macro defines the value to set the measurement range 
of the ADXL345 to +4g 


ADXL345 RESET (0x00): This macro defines the reset value for certain registers 


ADXL345 MEASURE BIT (0x08): This macro defines the bit value to enable measurement 
mode in the power control register 


ADXL345 MULTI BYTE ENABLE (0x40): This macro defines the bit to enable 
multi-byte operations 


ADXL345 READ OPERATION (0x80): This macro defines the bit to specify a read operation 


Next, we populate the adx1345 . c file: 


#include "adx1345.h" 


void adxl read(uint8 t address, uint8 t * rxdata) 


{ 


/*Set read operation*/ 
address |= ADXL345 READ OPERATION; 


/*Enable multi-byte*/ 
address |= ADXL345 MULTI BYTE ENABLE; 


/*Pull cs line low to enable slave*/ 
esmenabilel (i, 


/*Send address*/ 
spil transmit (&address,1) ; 
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/*Read 6 bytes */ 
spil_ receive (rxdata, 6) ; 


/*Pull es line high to disable slave*/ 
cs disable() ; 


void adxl write (uint8 _t address, uint8 t value) 


{ 


uints t data[2]; 


/*Enable multi-byte, place address into buffer*/ 
data[0] = address|ADXL345 MULTI_BYTE_ ENABLE; 


/*Place data into buffer*/ 
data[1] = value; 


/*Pull cs line low to enable slave*/ 
cs_enable(); 


/*Transmit data and address*/ 
spil_ transmit (data, 2); 


/*Pull es line high to disable slave*/ 
ets_celijseiolle () 9 


void adxl init (void) 


{ 


/*Enable SPI gpio*/ 
spi_gpio init(); 


/*Config SPI*/ 
spil_config(); 


/*Set data format range to +-4g*/ 
adxl_write (ADXL345 REG DATA FORMAT, ADXL345 RANGE 4G) ; 


/*Reset all bits*/ 
adxl_ write (ADXL345 REG POWER CTL, ADXL345 RESET) ; 
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} 


/*Configure power control measure bit*/ 
adxl_write (ADXL345 REG POWER_CTL, ADXL345 MEASURE BIT) ; 


Let’s analyze the functions line by line, starting with the adx1_read() function. 


Function - adxl_read() 


Let’s break down the read function: 


address |= ADXL345 READ OPERATION;;: This line sets the MSB of the address to 
indicate a read operation 


address |= ADXL345 MULTI_BYTE_ENABLE;;: This sets the multi-byte bit to enable 
multi-byte operations 


cs_enable () ;: This function pulls the CS line low, enabling communication with the ADXL345 


spil_transmit (&address, 1) ;: This transmits the address (with read and multi-byte 
bits set) to the ADXL345 


spil_receive(rxdata, 6) ;: This line reads 6 bytes of data from the ADXL345 and 
stores it in the buffer pointed to by rxdata 


cs_disable () ;: This function pulls the CS line high, ending communication with the ADXL345 


Next, we have the adxl_write() function. 


Function - adxl_write 


Let’s go through each line of this function: 


data[0] = address | ADXL345 MULTI_BYTE_ENABLE;;: This sets the multi-byte 
bit and stores the modified address in the buffer 


data[1] = value;: This stores the data to be written in the buffer 
cs_enable () ;: This function pulls the CS line low, enabling communication with the ADXL345 


spil transmit (data, 2) ;: This transmits the address and data to the ADXL345 in 
one transaction 


cs_disable() ;: This function pulls the CS line high, ending communication with the ADXL345 


Finally, we have the adx1l_ init () function. 
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Function - adxl_init 


Let’s analyze the initialization function: 


spi_gpio init () ;: This function initializes the GPIO pins needed for SPI communication 
spil_config() ;: This function configures the SPI settings (clock speed, mode, etc.) 


adxl_ write (ADXL345 REG DATA FORMAT, ADXL345 RANGE 4G) ;: This line 
writes to the data format register to set the measurement range of the ADXL345 to +4g 


adxl_ write (ADXL345 REG POWER CTL, ADXL345 RESET) ;: This line writes to 
the power control register to reset all bits 


adxl_write (ADXL345 REG POWER CTL, ADXL345 MEASURE BIT) ;: This line 
writes to the power control register to set the measure bit, enabling measurement mode 


We are now ready to test the driver inside the main. c file. Update your main . file as shown next: 


#include <stdio.h> 
#include <stdint.h> 
#include "stm32f4xx.h" 
#include "uart.h" 
#include "adx1345.h" 


//Variables for storing accelerometer data 


intile te accel x, aeceliiy, accel; 


double accel x g, accel yg, accel zg; 


uint8 t data_buffer [6]; 


int main (void) 


{ 


tekae iep 


// Initialize the ADXL345 accelerometer 
adz inito 


while (1) 
// Read accelerometer data starting from the data start 
// register 
adxl_read(ADXL345 REG DATA START, data_buffer) ; 


// Combine high and low bytes to form the accelerometer data 
accelix = ((intlleme) (Care owira << 6) | data_buffer[0]); 


257 


258 Serial Peripheral Interface (SPI) 


accelily = (Gntle ve) ((datalburter [3] << 8) | caca owr rer [|Z] )) z 
accelliz, = (imele e) ((datalburter [5] z2 8) | data_buffer[4]); 


// Convert raw data to g values 
cceall x — accel x = 0 00767 
accel y g = accel y = 0.00737 
aceel z g = accel z = 0.00787 


//Print values for debugging purposes 
printf ("accel x : %d accel y : %d accel _z : %d\n\ 
ie"! ACCEL ss, ECCE Wy ECEE 4) 5 


return 0; 


} 


Lets break down the main () function: 


e accel_x, accel _y, accel_z: These are variables to store the raw accelerometer data 
for each axis. 


e accel x g, accel_y_g, accel_z_g: These are variables to store the converted 
accelerometer data in g units. 


e data_buffer [6]: This is a buffer to hold the raw data bytes read from the ADXL345. 
e adxl init (): This initializes the ADXL345 accelerometer. 


e adxl read(ADXL345 REG DATA START, data_buffer) ;: This line reads data 
from the ADXL345 starting at the specified register (ADXL345 REG DATA START). The 
data is stored in data_buffer. 


Finally, we have the lines for constructing the final 16-bit values: 


accel: = (Gntle le) (data butter [1]) << 6) | data_buffer[0]); 
BoC y = (alimesS ic) (Carta e rerlel az 2) | data_buffer[2]); 
OCS 4 = Gimel ic) ((clevcei_ljuneere|[5]] << 2) | data_buffer[4]); 


The data read from the ADXL345 is in 2 bytes (high and low) for each axis. These lines combine the 
bytes to form 16-bit values for each axis: 


acel oe g = accel oe = 000V; 
accolade = ceal y = 0.00767 
cceall z g = ecel z = 0-0076; 


Summary 


These lines convert the raw accelerometer values to g values: 


printf ("accel_x : $d accel_y : sd accel_z : %d\n\ 
is! ACCEL Ss, ACCC yac v4) y 


This line outputs the raw accelerometer data for debugging purposes: 


Now, let’s test the project. To test the project, compile the code and run it on your microcontroller. 
Open RealTerm or another serial terminal application and configure it with the appropriate port and 
baud rate to view the debug messages. Press the black pushbutton on the development board to reset 
the microcontroller. You should see the x, y, and z accelerometer values continuously being printed. 
Try moving the accelerometer to observe the values change significantly. 


Summary 


In this chapter, we explored the SPI protocol, a widely used communication protocol in embedded 
systems for efficient data transfer between microcontrollers and peripherals. We began by understanding 
the basic principles of SPI, including its master-slave architecture, data transfer modes, and typical 
use cases, emphasizing its advantages such as full-duplex communication and high-speed operation. 


Next, we examined the SPI peripheral in STM32F4 microcontrollers, focusing on critical registers 
such as SPI Control Register 1 (SPI_CR1), SPI Status Register (SPI_SR), and SPI Data Register 
(SPI_DR). We detailed how to configure these registers to set up the SPI peripheral for communication, 
covering important aspects such as clock polarity (CPOL) and clock phase (CPHA), data frame size, 
and master/slave configuration. 


We then applied this theoretical knowledge by developing a bare-metal SPI driver. The development 
process included initializing the SPI peripheral, implementing data transmission and reception functions, 
and handling CS management. We also integrated the SPI driver with an ADXL345 accelerometer, 
using SPI to communicate with the sensor and retrieve acceleration data. Finally, we tested the driver 
by reading and displaying the accelerometer data in real time. 


In the next chapter, we will explore the final of the three most common communication protocols in 
embedded systems: [2C. 
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In this chapter, we will learn about the Inter-Integrated Circuit (12C) communication protocol. We 
will begin by exploring the fundamental principles of the I2C protocol, covering its modes of operation, 
addressing methods, and the communication process. Then, we will examine the key registers of the 
12C peripheral in STM32 microcontrollers and apply this knowledge to develop a bare-metal I2C driver. 


In this chapter, we will cover the following main topics: 


e An overview of the I2C protocol 
e The STM32 I2C peripheral 
e Developing the I2C Driver 


By the end of this chapter, you will have a solid grasp of the I2C protocol and be equipped with the 
skills to develop bare-metal drivers for I2C. 
Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


An overview of the I2C protocol 


I2C is another commonly used protocol. Let’s explore what it is, its key features, how it works, and 
its data format. 
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What is 12C? 


I2C is a multi-master, multi-slave, packet-switched, single-ended, serial communication bus invented 
by Philips Semiconductor (now NXP Semiconductors). Its designed for short-distance communication 
within a single device or between multiple devices on the same board. I2C is known for its simplicity 
and ease of use, making it a popular choice for communication between microcontrollers and other 
ICs. Let’s see its key features. 


The key features of 12C 


I2C has a number of unique features, which makes it ideal for various applications in embedded systems: 


A two-wire interface: I2C uses only two wires, Serial Data (SDA) and Serial Clock (SCL), 
which simplifies the wiring and reduces the number of pins required on the microcontroller. 


Multi-master and multi-slave: Multiple master devices can initiate communication on the bus, 
and multiple slave devices can respond. This flexibility allows for complex communication setups. 
The I2C protocol supports up to 128 devices with 7-bit addressing, although the practical limit 
is 119 due to reserved addresses. With 10-bit addressing, the protocol theoretically allows for 
1,024 devices, but again, reserved addresses reduce the practical maximum slightly. The 10-bit 
mode, while less common, enables a higher number of devices on the same bus. 


Addressable devices: Each device on the I2C bus has a unique address, enabling the master 
to communicate with specific slaves. 


Synchronous communication: The SCL line provides the clock signal, ensuring that data is 
transferred synchronously between devices. 


Speed variants: I2C supports various speed modes, including standard mode (100 kHz), 
fast mode (400 kHz), fast mode plus (1 MHz), and high-speed mode (3.4 MHz), catering to 
different speed requirements. 


Simple and low-cost: The protocol’s simplicity and minimal hardware requirements make it 
cost-effective and easy to implement. 


Let’s look at the I2C interface. 


The 12C interface 


The I2C interface consists of two main lines: 


Serial Data (SDA): This line carries the data being transferred between devices. It’s a bidirectional 
line, meaning that both the master and slave can send and receive data. 


Serial Clock (SCL): This line carries the clock signal generated by the master device. It 
synchronizes the data transfer between the master and the slave. 


An overview of the I2C protocol 


Figure 13.1: The 12C interface - multiple slaves 


These two lines are connected to all devices on the bus, with pull-up resistors to ensure that the lines 
are pulled to a high state when idle. Let’s see how it works. 


How 12C works 


Understanding how I2C works involves looking at the roles of master and slave devices, the addressing 
scheme, and the communication process. 


The following are the roles and addressing scheme: 


Master device: The master device initiates communication and generates a clock signal. It 
controls the flow of data and can address multiple slaves. 


Slave device: The slave device responds to the master’s commands and performs the requested 
operations. Each slave has a unique 7-bit or 10-bit address that the master uses to identify it. 


The communication process is as follows: 


l. 


A start condition: Communication begins with the master generating a start condition. This 
involves pulling the SDA line low while the SCL line is high. 


Address frame: The master sends the address of the target slave device, followed by a read/ 
write bit indicating the operation type (0 for write and 1 for read). 


Acknowledge (ACK) bit: The addressed slave responds with an ACK bit by pulling the SDA 
line low during the next clock pulse. 


Data Frames: Data is transferred in 8-bit frames. Each byte is followed by an ACK bit from 
the receiver. 
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5. A stop condition: The master ends the communication by generating a stop condition, which 
involves pulling the SDA line high while the SCL line is high. 


Start Address Frame R/W ACK Data Frame ACK Stop 


1 bit 7 to 10 bits 1bit bit 8 bits 1 bit 8 bits 1 bit 1 bit 


Figure 13.2: The |2C packet 


Before we proceed to the next section, lets take a moment to touch on the I2C data transfer, using 
an example. 


Let’s begin by revisiting the role of the data frame and the start condition: 


e Data frames: Data is transferred in 8-bit bytes. After each byte, the receiver sends an ACK bit 
to confirm successful reception. 


e Repeated start condition: If the master needs to communicate with another slave or continue 
communication without releasing the bus, it can generate a repeated start condition instead 
of a stop condition. 


Let’s see the data transfer: 


e Write operation: The master sends a start condition, the address frame with the write bit, 
and the data frames. Each data byte is followed by an ACK bit from the slave. 


e Read operation: The master sends a start condition, the address frame with the read bit, and 
then reads the data frames from the slave. Each data byte is acknowledged by the master with 
an ACK bit, except for the last byte, which is followed by a NACK to indicate the end of the 
read operation. 


For a better understanding, let’s analyze Figures 13.3 to 13.6, starting with the start and stop conditions. 
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Figure 13.3: The start condition 
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The following is the stop condition: 
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Figure 13.4: The stop condition 


The start condition occurs when the master device pulls the SDA line low while the SCL line remains 
high. This sequence signals all devices on the I2C bus that a communication session is about to begin, 
allowing the master to claim the bus for its intended operations. Without a valid start condition, the 
PC communication cannot commence. 


Conversely, the stop condition signals the end of communication. The master device releases the SDA 
line to a high state while the SCL line is high, indicating that the communication session is complete 
and the bus is now free for other devices. The proper use of stop conditions is essential for ensuring 
that no devices remain active on the bus, which could lead to conflicts or communication errors. 


Next, let’s see how the I2C protocol distinguishes between zeros and ones. 


SDA 


SCL 


1 0 


Figure 13.5: The data transmission process 
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Figure 13.5 illustrates the data transmission process. Data is sent bit by bit, synchronized with the 
clock pulses on the SCL line. As shown, each bit of data is placed on the SDA line while the SCL line 
is low. When the SCL line transitions to high, the state of the SDA line is read by the receiving device. 
This particular figure shows the transmission of 1, followed by a 0. 


Finally, let’s examine the complete packet and how it interacts with the SDA and SCL lines. 


Address Data 


SCL 


START STOP 


Figure 13.6: The complete packet 


Figure 13.6 provides a comprehensive view of a complete I?C communication packet, showcasing the 
relationship between the SDA and SCL lines throughout the transaction. The communication begins 
with a start condition, where the SDA line is pulled low while the SCL line remains high, signaling 
the initiation of a new communication sequence. 


Following the start condition, the address frame is transmitted. This frame contains the 7-bit address 
of the target device, followed by the read/write (R/W) bit that indicates whether the master intends 
to read from or write to the slave device. The address frame is then acknowledged by the slave device 
with an ACK bit, confirming that it is ready to proceed with the communication. 


After the address frame, the data frame is transmitted. The data is sent in 8-bit bytes, with each bit 
being placed on the SDA line while the SCL line clocks each bit in sync. After each byte of data, the 
receiving device responds with another ACK bit, ensuring that the data was received correctly. 


The communication concludes with a stop condition, where the SDA line is released to go high while 
the SCL line is also high. This signals the end of the communication session, freeing the bus for other 
potential communications. This complete cycle, from start to stop, forms the backbone of data exchange 
in the I’C protocol, ensuring structured and reliable communication between devices. 


The STM32F4 I2C peripherals 


This concludes our overview of the I2C protocol. In the next section, we shall analyze the I2C peripheral 
in the STM32F4 microcontroller. 


The STM32F4 I2C peripherals 


Depending on the specific model of STM32F4 you are working with, you can typically find up to 
three I2C peripherals labeled I2C1, 12C2, and 12C3. These peripherals enable the microcontroller to 
communicate with I2C-compatible devices using the standard two-wire interface. 


The I2C peripherals in STM32F4 microcontrollers come packed with features that enhance their 
versatility and performance: 


e Multi-master and multi-slave capabilities: Each I2C peripheral can operate as both master 
and slave, supporting multiple master configurations where more than one master device can 
control the bus 


e Standard, fast, and fast mode plus: The peripherals support multiple speed modes, including 
standard mode (100 kHz), fast mode (400 kHz), and fast mode plus (1 MHz), allowing for 
flexibility in communication speed 


e 10-bit addressing: In addition to standard 7-bit addressing, the I2C peripherals also support 
10-bit addressing, enabling communication with a broader range of devices 


e Dual addressing mode: Each I2C peripheral can be configured to respond to two different 
addresses, useful for complex multi-device setups 


e DMA support: Direct Memory Access (DMA) support is available, enabling efficient data 
transfer without CPU intervention 


Let’s examine the key registers of this peripheral. 


The key I2C registers 


Configuring the I2C peripheral on an STM32 microcontroller involves several key registers that 
control various aspects of its operation. 


Each register has specific bits that need to be set correctly to ensure proper functionality. Let’s break 
down the main registers we'll be working with, starting with Control Register 1. 


12C Control Register 1 (12C_CR1) 


I2C_CR1 is one of the primary control registers used to configure the I2C peripheral’s basic operational 
settings. It provides options to enable the peripheral, manage the start and stop conditions, and 
control the acknowledge feature. 
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The key bits in this register include the following: 


e Peripheral enable (PE): This bit enables or disables the I2C peripheral. Setting this bit to 1 
turns on the I2C peripheral, while clearing it turns it off. 


e Start generation (START): Setting this bit generates a START condition, initiating communication. 
e Stop generation (STOP): Setting this bit generates a STOP condition, terminating communication. 
e Acknowledge enable (ACK): When set, this bit enables the ACK after each byte received. 

e Acknowledge/PEC position (POS): This bit controls the position of the ACK bit. 

e Software reset (SWRST): Setting this bit resets the I2C peripheral. 


You can find detailed information about this register on page 492 of the STM32F4 reference manual 
(RM0383). Next, lets look at I2C Control Register 2. 


12C Control Register 2 (12C_CR2) 


I2C_CR2 is another crucial control register that handles different aspects of I2C operation, including 
clock frequency, interrupt enable, and DMA control. Key bits in this register include the following: 


e FREQ[5:0] (peripheral clock frequency): These bits set the I2C peripheral clock frequency 
in MHz 


«e DMAEN (DMA requests enable): When set, this bit enables the DMA requests for the 
I2C peripheral 


You can find detailed information about this register on page 494 of the STM32F4 reference manual 
(RM0383). Next, lets look at the I2C Clock Control Register. 


12C Clock Control Register (l12C_CCR) 


I2C_CCR configures the clock control settings for standard, fast, and fast mode plus operations. Key 
bits in this register include the following: 


e CCR[11:0] (clock control): These bits set the clock control value, determining the I2C clock speed 
e DUTY (fast mode duty cycle): This bit selects the duty cycle for fast mode 


e F/S (I2C master mode selection): This bit selects between standard mode (0) and fast mode (1) 


You can find detailed information about this register on page 502 of the STM32F4 reference manual 
(RM0383). The next register is the I2C Rise Time Register. 
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12C TRISE register (I12C_TRISE) 


I2C_TRISE configures the maximum rise time for the I2C signals, ensuring compliance with I2C 
specifications. This register has only one field - TRISE[5:0] (maximum rise time). These bits set the 
maximum rise time for the SDA and SCL signals in nanoseconds. 


The final register is the I2C Data Register. 
12C Data Register (I2C_DR) 


I2C_DR is the data register used for both transmitting and receiving data. Data written to this 
register is transmitted, and data received from the bus is stored in this register. This register has only 
one field - DR[7:0] (8-bit data register): This register holds the 8-bit data to be transmitted or the 
data received from the bus. 


With these registers in mind, we're now ready to develop the I2C driver. Let's do that in the next section. 


Developing the I2C driver 


Let’s develop the I2C driver. Create a copy of your previous project in your IDE and rename this 
copied project 12C. Next, create a new file named i2c.c in the Src folder and another file named 
i2c.hin the Inc folder. 


The initialization function 


Let’s populate the i2c . c file, starting with the macros and initialization function: 


#include "stm32f4xx.h" 


#define GPIOBEN (1U<<1) 
#define I2C1EN (1U<<21) 
#define I2C_100KHZ 80 
#define SD MODE MAX RISE TIME ay 
#define CREE (1U<<0) 
#define SR2_ BUSY (1U<<1) 
#define CR1_ START (1U<<8) 
#define SR1_SB (1U<<0) 
#define SR1_ ADDR (1U<<1) 
#define SR1_TXE (1U<<7) 
#define CR1 ACK (1U<<10) 
#define CRASS LOR (1U<<9) 
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#define SR1_ RXNE (1U<<6) 
#define Shula ae (Gliviee”)) 
/* 

wo I ooo SCI 

a J) SsSoS SDA 

* */ 


void i2cl_init (void) 
/*Enable clock access to GPIOB*/ 
RCC->AHB1ENR | =GPIOBEN ; 


/*Set PB8 and PB9 mode to alternate function*/ 
GPIOB->MODER &=~(1U<<16) ; 
GPIOB->MODER | =(1U<<17 


GPIOB->MODER &=~ (1U<<18) ; 
GPIOB- >MODER |=(1U<<19 


/*Set PB8 and PB9 output type to open drain*/ 
GPIOB->OTYPER | =(1U<<8 
GPIOB->OTYPER |=(1U<<9 


/*Enable Pull-up for PB8 and PB9*/ 
GPIOB->PUPDR | =(1U<<16 z 
GPIOB->PUPDR &=~(1U<<17); 


GPIOB->PUPDR |=(1U<<18 
GPIOB->PUPDR &=~(1U<<19); 


/*Set PB8 and PB9 alternate function type to I2C (AF4)*/ 
GPIOB->AFR[1] &=~(1U<<0); 
GPIOB->AFR[1] &=~(1U<<1); 


GPIOB->AFR[1] |=(1U<<2) ; 
GPIOB->AFR[1] &=~(1U<<3) ; 
GPIOB->AFR[1] &=~(1U<<4) ; 
GPIOB->AFR[1] &=~(1U<<5) ; 


GPIOB->AFR[1] |=(1U<<6) ; 
GPIOB->AFR[1] &=~(1U<<7); 
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/*Enable clock access to 12C1*/ 
RCC->APB1ENR |= T2 EIEN 


/*Enter reset mode */ 
I2C1->CR1 |= (1U<<15); 


/*Come out of reset mode */ 
I2C1->CR1 &=~(1U<<15) ; 


/*Set Peripheral clock frequency*/ 
T2C1-sCR2 = (1U<<4); //16 Mhz 


/*Set I2C to standard mode, 100kHz clock */ 
IZCILI- = 2E LOOK, 


/*Set rise time */ 
I2C1->TRISE = SD MODE MAX RISE TIME; 


/*Enable I2C1 module */ 
I2C1->CR1 |= CR1_PE; 


Let’s break down what we have so far: 


RCC->AHB1ENR |= GPIOBEN; 


This line enables the clock for GPIOB by setting the corresponding bit. 


GPIOB->MODER &=~ (1U<<16) ; 
GPIOB->MODER |=(1U<<17) ; 
GPIOB->MODER &=~ (1U<<18) ; 
GPIOB- >MODER EGU 


These lines configure PB8 and PB9 pins to an alternate function mode for I2C. 


GPIOB->OTYPER |=(1U<<8) 
GPIOB->OTYPER |=(1U<<9) 


These lines configure the pins as open-drain, which is required for 12C communication. 


GPIOB->PUPDR EGUS ie 
GPIOB->PUPDR &=~ (1U<<17) ; 
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GPIOB->PUPDR |=(1U<<18) ; 
GPIOB->PUPDR &=~(1U<<19) ; 


These lines enable pull-up resistors for the I2C pins. 


GPIOB->AFR[1] &=~(1U<<0); 


GPIOB->AFR[1] &=~(1U<<1); 
GPIOB->AFR[1] |=(1U<<2); 
GPIOB->AFR[1] &=~ (1U<<3); 


GPIOB->AFR[1] &=~(1U<<4) ; 
GPIOB->AFR[1] &=~(1U<<5); 
GPIOB->AFR[1] |=(1U<<6) ; 
GPIOB->AFR[1] &=~ (1U<<7) ; 


These lines configure the alternate function for the pins to 12C1. 
RCC- >APB1ENR le I2C1EN; 


This line enables the clock for the I2C1 peripheral. 


TCI CR je (ilies) 2 
I2C1->CR1 &=~(1U<<15) ; 


These lines reset the I2C1 peripheral. 
UAC SSeS (alert) 
This configures the I2C1 clock. 


I2C1-5CCR = I2C_100KHZ; 


This line sets the clock control register for 100 kHz standard mode, using the macro we defined. 


I2C1->TRISE = SD MODE MAX RISE TIME; 


This line sets the rise time for the I2C signals using the macro we defined. The TRISE register specifies 
the maximum time the signal is allowed to take to transition from a low to a high state on the I2C 
bus. Setting this value correctly is important to ensure that the I2C communication adheres to the 
timing requirements of the I2C standard, which helps maintain reliable and stable communication. 


I2C1->CR1 |= CR1_PE; 


This line enables the I2C1 peripheral by setting the PE bit. Next, we will add and analyze the function 
to read a byte from an I2C slave device. 
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The read function 


Let’s analyze the read function: 


void i2cl byte read(char saddr, char maddr, char* data) { 
volatile int tmp; 


/* Wait until bus not busy */ 
while (I2C1->SR2 & (SR2_ BUSY)) {} 


/* Generate start */ 
I2C1->CR1 |= CR1_START; 


/* Wait until start flag is set */ 
while (!(I2C1->SR1 & (SR1_SB))) {} 


/* Transmit slave address + Write */ 
I2C1->DR = saddr << 1; 


/* Wait until addr flag is set */ 
while (!(I2C1->SR1 & (SR1_ADDR))) {} 


/* Clear addr flag */ 
(edhe) =e IEAIL = SISA p 


/* Send memory address */ 
I2C1->DR = maddr; 


/*Wait until transmitter empty */ 
while (!(I2C1->SR1 & SR1_TXE)) {} 


/*Generate restart */ 
T2C1->CR1 |= CR1_START; 


/* Wait until start flag is set */ 
while (!(I2C1->SR1 & SR1_SB)) {} 


/* Transmit slave address + Read */ 
I2C1->DR = saddr << 1 | 1; 


/* Wait until addr flag is set */ 
while (!(I2C1->SR1 & (SR1_ADDR))) {} 
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} 


/* Disable Acknowledge */ 
T2C1->CR1 &= ~CR1 ACK; 


/* Clear addr flag */ 
(eins: = WAIL SSSR} 


/* Generate stop after data received */ 
I2C1->CR1 |= CR1_STOP; 


/* Wait until RXNE flag is set */ 
while (!(I2C1->SR1 & SR1_RXNE)) {} 


/* Read data from DR */ 
*data++ = I2C1->DR; 


Let’s break down what we have so far: 


while (I2C1l->SR2 & (SR2_BUSY) ) {}: This line waits for the I2C bus to be free by 
checking the state of the BUSY bit in I2C Status Register 2. 


I2C1->CR1 |= CR1_START;: This line initiates a start condition on the I2C bus. 


while (!(I2C1->SR1 & (SR1_SB))) {}: This line waits until the start condition is 
acknowledged by checking the SB bit in I2C Status Register 1. 


I2C1->DR = saddr << 1;: This line sends the slave address with the write bit. The 7-bit 
address of the device is left-shifted by 1 bit to make room for the R/W bit in the least significant 
bit (LSB) position. By only shifting saddr left by 1, we prepare the address for a subsequent 
write operation to the slave device. 


while (!(I2C1->SR1 & (SR1_ADDR) ) ) {}: This line waits until the address is 
acknowledged by checking the ADDR bit in I2C Status Register 1. 


tmp = I2C1->SR2;: This line clears the address flag by simply reading I2C Status Register 2. 
2C1->DR = maddr;: Here, we send the memory address to read from the slave device. 


while (!(I2C1->SR1 & SR1_TXE) ) {}: This line waits until the data register is empty 
by reading the transmit buffer empty (TXE) bit in I2C Status Register 1. 


I2C1->CR1 |= CR1_START;: This line initiates a restart condition on the I2C bus. 


while (!(I2C1->SR1 & SR1_SB)) {}: Here, we wait until the restart condition is 
acknowledged by checking the SB bit in I2C Status Register 1. 
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e I2C1->DR = saddr << 1 | 1;: This line prepares the I2C data register for a read 
operation by setting up the 7-bit I2C address of the slave device and appending the R/W bit. 
Specifically, saddr << 1 shifts the 7-bit address left by one bit to make room for the LSB, 
which is then set to 1 using the bitwise OR operator (| 1). This final value, with the LSB set to 
1, indicates a read operation when loaded into the I2C1 data register (DR). Hence, this line 
configures the I2C peripheral to initiate communication with the slave device, requesting to 
read data from it. 


e while (!(I2C1->SR1 & (SR1_ADDR) ) ) {}: This line waits until the address 
is acknowledged. 


e I2C1->CR1 &= ~CR1_ACK;;: This line disables the acknowledge bit to prepare for a 
stop condition. 


e tmp = I2C1->SR2;: This line clears the address flag by reading I2C Status Register 2. 
e I2C1->CR1 |= CR1_STOP;: This initiates a stop condition on the I2C bus. 


e while (!(I2C1->SR1 & SR1_RXNE) ) {}: This line waits until the receive buffer is not 
empty by reading the Receive Buffer Register Not Empty (RXNE) flag in I2C Status Register 
1. This flag indicates that new data has been received and is available in the data register. 


e *data++ = I2C1->DR;: This line is responsible for storing the received byte of data from 
the I2C DR in the memory location pointed to by the data pointer. 


Next, we have a function to read multiple bytes from the slave device: 
void i2cl burst _read(char saddr, char maddr, int n, char* data) { 


volatile int tmp; 


/* Wait until bus not busy */ 
while (I2C1->SR2 & (SR2_BUSY)) {} 


/* Generate start */ 
I2C1->CR1 |= CR1_START; 


/* Wait until start flag is set */ 
while (!(I2C1->SR1 & SR1_SB)) {} 


/* Transmit slave address + Write */ 
I2C1->DR = saddr << 1; 


/* Wait until addr flag is set */ 
while (!(I2C1->SR1 & SR1_ADDR)) {} 


/* Clear addr flag */ 
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tmp = T2C1->SR2; 


/* Wait until transmitter empty */ 
while (!(I2C1->SR1 & SR1_TXE)) {} 


/*Send memory address */ 
I2C1->DR = maddr; 


/*Wait until transmitter empty */ 
while (!(I2C1->SR1 & SR1_TXE)) {} 


/*Generate restart */ 
I2C1->CR1 |= CR1_START; 


/* Wait until start flag is set */ 
while (!(I2C1->SR1 & SR1_SB)) {} 


/* Transmit slave address + Read */ 
I2C1->DR = saddr << 1 | 1; 


/* Wait until addr flag is set */ 
while (!(I2C1->SR1 & (SR1_ADDR))) {} 


/* Clear addr flag */ 
(eis) = WAC SSR y 


/* Enable Acknowledge */ 
I2C1->CR1 |= CRI ACK; 


while(n > 0U) 
{ 
/*if one byte*/ 
pura ial == ALU) 
{ 
/* Disable Acknowledge */ 
TACIE > CRS — a CRACKS, 


/* Generate Stop */ 
I2C1->CR1 |= CR1_STOP; 


/* Wait for RXNE flag set */ 
while (!(I2C1->SR1 & SR1_RXNE)) {} 
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/* Read data from DR */ 
*data++ = I2C1->DR; 


break; 

else 
/* Wait until RXNE flag is set */ 
while (!(I2C1->SR1 & SR1_RXNE)) {} 
/* Read data from DR */ 
(*data++) = I2C1->DR; 
ne 


} 


This function reads multiple bytes of data from a specified memory address in the I2C slave device. 
Here’s a breakdown of what it does: 


1. Waits for bus availability: The function starts by ensuring that the I2C bus is not busy, waiting 
until it is free to initiate communication. 


2. Generates a start condition: It generates a start condition to begin communication with the 
slave device. 


3. Transmits a slave address for write: The function sends the slave device address with a write 
bit, indicating that it will initially write data to specify the memory address. 


4. Waits for the address flag and clears it: It waits for the address flag to be set and then clears 
it by reading the SR2 register. 


5. Transmits the memory address: The memory address from which to start reading is sent to 
the slave device. 


6. Generates a restart condition: A repeated start condition is generated to switch the communication 
mode from write to read. 


7. Transmits a slave address for read: The function sends the slave address with a read bit, 
indicating that it will read data from the slave device. 


8. Waits for the address flag and clears it: Again, it waits for the address flag to be set and clears 
it by reading the SR2 register. 


9. Enables acknowledge: The acknowledge bit is set. 


278 Inter-Integrated Circuit (12C) 


10. Reads a data loop: The function enters a loop to read the specified number of bytes: 


* A single-byte read: If only one byte is left to read, the acknowledge bit is cleared, a stop 
condition is generated, and the data is read into the buffer 


- A multiple-byte read: If more than one byte is to be read, it waits for the RXNE flag to 
indicate that data is ready, reads the data into the buffer, and decrements the byte counter 


Finally, we add the function to write data to the slave device. 
The write function 
Let’s break down the function to write multiple bytes to the slave device: 
void i2cl burst write(char saddr, char maddr, int n, char* data) { 


volatile int tmp; 


/* Wait until bus not busy */ 
while (I2C1->SR2 & (SR2_BUSY)) {} 


/* Generate start */ 
I2C1->CR1 |= CR1_START; 


/* Wait until start flag is set */ 
while (!(I2C1->SR1 & (SR1_SB))) {} 


/* Transmit slave address */ 
I2C1->DR = saddr << 1; 


/* Wait until addr flag is set */ 
while (!(I2C1->SR1 & (SR1_ADDR))) {} 


/* Clear addr flag */ 
‘emo = WACO SSIs 


/* Wait until data register empty */ 
while (!(I2C1->SR1 & (SR1_TXE))) {} 


/* Send memory address */ 
I2C1->DR = maddr; 


for (int i = 0; i < n; i++) { 


/* Wait until data register empty */ 
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while (!(I2C1->SR1 & (SR1_TXE))) {} 


/* Transmit memory address */ 
I2C1->DR = *data++; 


/* Wait until transfer finished */ 
while (!(I2C1->SR1 & (SR1_BTF))){} 


/* Generate stop */ 
I2C1->CR1 |= CR1_STOP; 


} 


This function writes multiple bytes of data to a specific memory address in the I2C slave device. The 
function begins by waiting for the I2C bus to be free, ensuring that there is no ongoing communication. 
It then generates a start condition to initiate communication with the slave device. The slave address 
is transmitted with a write bit, and then the function waits for the address flag to be set and cleared by 
reading the SR2 register. After ensuring the data register is empty, it sends the memory address where 
the data writing should begin. The function enters a loop to transmit each byte of data, waiting for the 
data register to empty before each byte is sent. Once all bytes have been transmitted, it waits for the 
byte transfer to finish and then generates a stop condition to end the communication. 


Our next task is to populate the i2c . h file. 
The header file 


Here is the code for the header file: 


#ifndef I2C H_ 
#define I2C H_ 


void i2cl init (void); 

void i2cl byte read(char saddr, char maddr, char* data); 

void i2cl burst read(char saddr, char maddr, int n, char* data); 
void i2cl burst _write(char saddr, char maddr, int n, char* data); 


#endif 


Let’s update our driver for the adxl345 device to use the I2C driver we developed. 
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The ADXL345 I2C driver 


Update the current adx1345 . c in the Src folder: 


#include 


"adx1345.h" 


// Variable to store single byte of data 


char data; 


// Buffer to store multiple bytes of data from the ADXL345 
uint8 t data_buffer[6]; 


void adxl_ read address (uint8 t reg) 


{ 


i2cl_byte read( ADXL345 DEVICE ADDR, reg, &data) ; 


void adxl write (uint8 t reg, char value) 


{ 


char data[1]; 


data [0] 


value; 


i2cl_burst_write( ADXL345 DEVICE ADDR, reg,1, data) ; 


void adxl read values (uint8 t reg) 


{ 


// Read 6 bytes into wthe data buffer 
i2cl_burst_read (ADXL345_DEVICE_ADDR, reg, 6, (char *)data_buffer) ; 


void adxl_init (void) 


{ 


/*Enable I2C*/ 


Ae Alva 


/*Read t 


EK 


ne 


da 


DEVID, this should return OxE5*/ 


adxl_read_address (ADXL345 REG DEVID) ; 


/*Set da 


ta 


adxl_write 
/*Reset all bits*/ 


adxl_ wri 


ES 


format range to +-4g*/ 
(ADXL345 REG DATA FORMAT, ADXL345 RANGE 4G) ; 


(ADXL345 REG POWER_CTL, ADXL345 RESET) ; 
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} 


/*Configure power control measure bit*/ 
adxl_write (ADXL345 REG POWER CTL, ADXL345 MEASURE BIT) ; 


Here is a breakdown of the code: 


The adxl_read_address function reads a single byte of data from a specified register in 
the ADXL345 accelerometer. It uses the i2c1_byte_read function to communicate over 
the I2C bus, fetching the data from the register identified by the reg parameter and storing 
it in the data variable. 


The adx1_write function writes a single byte of data to a specific register in the ADXL345. 
It prepares a single-element array, containing the value to be written, and then uses i2c1_ 
burst_write to send this data to the register specified by the reg parameter, over the 
I2C interface. 


The adxl_read_values function reads a block of data from the ADXL345 - specifically, 
6 bytes, which is the size required to capture the accelerometer’s X, Y, and Z axis data. It uses 
i2c1_burst_read to pull this data, starting from the register specified by the reg parameter, 
and stores it in data_buf fer for further processing. 


The adxl_ init function initializes the ADXL345 accelerometer. It first enables I2C 
communication by calling i2c1_init, and then it checks the device's identity by reading 
the DEVID register. Following this, it configures the data format to a range of +4g, resets the 
power control register, and finally, sets the power control register to start measuring acceleration. 


Next, we update the current adx1345 .h in the Inc folder: 


#ifndef ADXL345 H_ 
#define ADXL345 H 


#include "i2c.h" 
#include <stdint.h> 


#define ADXL345 REG DEVID (0x00) 
#define ADXL345 DEVICE ADDR (0x53) 
#define ADXL345 REG DATA FORMAT (0x31) 
#define ADXL345 REG POWER CTL (0x2D) 
#define ADXL345 REG DATA START (0x32) 
#define ADXL345 REG DATA FORMAT (0x31) 
#define ADXL345_ RANGE 4G (0x01) 
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#define ADXL345 RESET (0x00) 
#define ADXL345 MEASURE BIT (0x08) 
void adxl init (void); 

void adxl read values(uint8 t reg); 


#endif 


We are now ready to test the driver inside the main . file. 
The main function 


Update your main . c file, as shown here: 


#include <stdio.h> 
#include <stdint.h> 
#include "stm32f4xx.h" 
#include "uart.h" 
#include "adx1345.h" 


//Variables for storing accelerometer data 
IME c Accel s ECE SY, ecel a; 
double accel_x Cj accel y Ch accel zg; 


extern uint8 t data_buffer[6]; 


int main (void) 


teare toie p 


// Initialize the ADXL345 accelerometer 
adxl_ init (); 


while (1) 
// Read accelerometer data starting from the data start 
// register 
adxl_read_values (ADXL345 REG DATA START) ; 


// Combine high and low bytes to form the accelerometer data 
accel. — imele e) C(datarburter [il] << 8) | carea owr rer lOl 5 
accel y = Cimcig t)i (dete wr rerlsl <= 8) | data_buffer[2]); 


Summary 


accel, = (Gntill6 e) (Cere mireris << 6) | data_buffer[4]); 


// Convert raw data to g values 
aeee os te) = ele@ell a. OQ 00E 
clea Sy tej = cecal iy 0- 007E 
accel a g = ACCEL z = 0.00757 


printf ("accel x : <d accel _y : sd accellz < sd\n\ 
ig) ACCC sx, aeee WW, accel ~)) p 


return 0; 


} 


It’s time to test the project. To do so, compile the code and run it on your microcontroller. Open 
RealTerm or another serial terminal application, and then configure it with the appropriate port and 
baud rate to view the debug messages. Press the black push button on the development board to reset 
the microcontroller. You should see the X, Y, and Z accelerometer values continuously being printed. 
Try moving the accelerometer to see how the values change significantly. 


Summary 


In this chapter, we learned about the I2C communication protocol. We began by discussing the 
fundamental principles of the I2C protocol, including its modes of operation, addressing schemes, 
and the step-by-step communication process. 


We then delved into the specifics of the STM32 I2C peripheral, highlighting key registers such as 
Control Register 1 (IZ2C_CR1), Control Register 2 (I2C_CR2), the Clock Control Register (IZ2C_CCR), 
and the Data Register (IZC_DR). 


Finally, we applied this theoretical knowledge to develop a bare-metal I2C driver. This driver allows 
us to initialize the I2C peripheral, perform both single-byte and burst data transfers, and handle 
communication with an external device such as the ADXL345 accelerometer. 


In the next chapter, we will learn about interrupts, a critical feature in modern microcontrollers that 
enables responsive and efficient handling of real-time events. 
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In this chapter, we will learn about interrupts and their critical role in embedded systems development. 
Interrupts are pivotal for creating responsive and efficient firmware, allowing microcontrollers to 
handle real-time events effectively. By understanding interrupts, you can develop systems that can 
react promptly to external stimuli, making your embedded applications more robust and versatile. 


We will begin by exploring the fundamental role of interrupts in firmware, contrasting them with 
exceptions to highlight their unique purposes and handling mechanisms. Following this, we will dive 
into the specifics of the Interrupt Service Routine (ISR), the Interrupt Vector Table (IVT), and the 
Nested Vectored Interrupt Controller (NVIC), which collectively form the backbone of interrupt 
handling in Arm Cortex-M microcontrollers. 


Next, we will focus on the STM32 External Interrupt (EXTI) controller, an essential peripheral for 
managing external interrupts in STM32 microcontrollers. We will examine the key features and registers 
of the EXTI controller, learning how to configure and utilize it for various applications. 


Finally, we will apply this knowledge by developing an EXTI driver, providing you with practical 
experience in implementing interrupt-driven firmware. This hands-on approach will solidify your 
understanding and enable you to create responsive, interrupt-based systems. 


In this chapter, we will cover the following main topics: 
e Interrupts and their role in firmware 
e The STM32 EXTI controller 
e Developing the EXTI driver 
By the end of this chapter, you will have a comprehensive understanding of interrupts and how to 


develop bare-metal EXTI drivers for STM32 microcontrollers, empowering you to create responsive 
and efficient embedded systems. 
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Technical requirements 


All the code examples for this chapter can be found on GitHub at the following link: 


https://github.com/PacktPublishing/Bare-Metal -Embedded-C- Programming 


Interrupts and their role in firmware 


Interrupts are one of the most critical mechanisms in embedded systems, allowing microcontrollers to 
react to real-time events efficiently. To fully appreciate their role in firmware, it’s essential to understand 
what interrupts are, how they work, and the various scenarios where they prove indispensable. So, let’s 
dive in and explore the fascinating world of interrupts, their operation, and their practical applications. 


What are interrupts? 


Imagine you're deeply engrossed in reading a book but then the doorbell rings. You momentarily 
stop reading, attend to the visitor, and then return to your book. Interrupts in microcontrollers work 
similarly. They are signals that temporarily halt the current execution of a program to allow a special 
routine, known as an ISR, to run. Once the ISR completes, the microcontroller resumes its previous 
task right where it left off. 


Interrupts can be triggered by hardware events, such as a timer overflow, a key press, or data reception 
on a communication interface. They can also be generated by software, providing a flexible way to 
manage both external and internal events. Now, let’s see how they work. 


How do interrupts work? 


At the heart of interrupt handling is the concept of context switching. When an interrupt occurs, the 
microcontroller saves its current state—essentially, a snapshot of all the important information, such 
as the program counter and CPU registers. This allows the microcontroller to pause its current task, 
execute the ISR, and then restore the saved state to continue where it left off. The process typically 
follows these steps: 


1. Interrupt request: An event triggers an Interrupt Request (IRQ). 


2. Acknowledge and prioritize: The interrupt controller acknowledges the request and prioritizes 
it based on predefined levels. 


3. Context save: The CPU saves its current execution context. 
4. Vector fetch: The CPU fetches the address of the ISR from the IVT. 
5. ISR execution: The ISR runs to handle the interrupt. 


Interrupts and their role in firmware 


6. Context restore: After the ISR completes, the CPU restores the saved context. 


7. Resume execution: The CPU resumes the interrupted task. 


You may wonder why interrupts are important. Let’s find out. 


Importance of interrupts in firmware 


Interrupts are important for creating efficient and responsive embedded systems. Here are some key 
reasons why they are so important: 


e Real-time response: Interrupts allow a microcontroller to react almost instantaneously to 
critical events. For instance, in a motor control system, an interrupt can immediately handle a 
sensor signal indicating that the motor has reached its desired position. 


e Resource optimization: Instead of constantly polling for events (which wastes CPU cycles and 
power), interrupts enable the CPU to remain in a low-power state or focus on other tasks until 
an event occurs. This optimization is crucial for battery-powered devices such as wearables 
or remote sensors. 


e Prioritization and preemption: Interrupts can be prioritized, allowing more critical tasks to 
preempt less critical ones. This ensures that high-priority tasks, such as emergency stop signals 
in industrial machinery, are addressed immediately. 


When discussing interrupts, we often encounter another key term: exceptions. Although they share 
similarities, they serve different purposes in embedded systems. Let’s explore the differences between 
interrupts and exceptions. 


Interrupts versus exceptions 


Interrupts are signals from hardware or software indicating an event that needs immediate attention. 
Examples include timer overflows, GPIO pin changes, and peripheral data reception. 


As we learned earlier, interrupts enable embedded systems to handle real-time events and are essential 
for responsive and efficient system behavior. When an interrupt occurs, the CPU stops executing the 
main program and jumps to a predefined address to execute the ISR. 


Exceptions are events that disrupt the normal execution flow, often due to errors such as divide-by-zero 
operations or accessing invalid memory addresses. While similar to interrupts, exceptions typically 
handle error conditions and system-level events. 
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Differences between interrupts and exceptions 
Here are some of the differences between the two: 


e Source: Interrupts usually originate from external hardware devices or other peripherals within 
the microcontroller, while exceptions are typically the result of internal CPU operations 


e Purpose: Interrupts manage real-time events, whereas exceptions handle error conditions and 
system anomalies 


e Handling: Both use ISRs, but exceptions often involve more complex error handling and 
recovery mechanisms 


To properly understand how interrupts are handled, we need to examine the three key components 
involved: the NVIC, the ISR, and the IVT. 


The NVIC, ISR, and IVT 


The NVIC in Arm Cortex-M microcontrollers, such as the STM32 series, plays a pivotal role in 
managing interrupts. Let’s explore what the NVIC is, how it works, and why it’s so important for 
embedded development. 


The NVIC 


The NVIC is a hardware module integrated into Arm Cortex-M microcontrollers that manages the 
prioritization and handling of interrupts. It enables the microcontroller to respond to interrupts 
quickly and efficiently, while also allowing for nested interrupts, where higher-priority interrupts 
can preempt lower-priority ones. This capability is essential for real-time applications where timely 
responses to events are critical. 


Its key features include the following: 


e Interrupt prioritization: The NVIC supports multiple priority levels, allowing us to assign 
different priorities to different interrupts. This ensures that more critical tasks are handled first. 


e Nested interrupts: The NVIC allows higher-priority interrupts to interrupt lower-priority 
ones. This feature is crucial for maintaining system responsiveness in real-time applications. 


e Dynamic priority adjustment: We can dynamically adjust the priority of interrupts during 
runtime, providing flexibility to adapt to changing conditions. 


The diagram in Figure 14.1 illustrates the NVIC and its connections to various components 
within the microcontroller. 
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SysTick 
EXTI — Cortex-M 
ADC =a Processor 
UART l NVIC 
TIM —— Exceptions 


Figure 14.1: The NVIC 


Next, we have the ISR. 


The ISR 


The ISR is a crucial piece of the puzzle when it comes to handling interrupts in embedded systems. 
The ISR is a specialized function that the CPU executes in response to an interrupt. Every interrupt 
has its own ISR. When an interrupt occurs, the CPU temporarily halts its current task, saves its state, 
and jumps to the ISR’s predefined address (function) to execute the necessary code. 


The final critical component is the IVT. 


The IVT 


The IVT is like the roadmap for the CPU when an interrupt occurs. It’s a data structure that holds the 
addresses of all the ISRs. Each interrupt source is assigned a specific entry in this table, mapping it to 
its corresponding ISR. When an interrupt is triggered, the CPU consults the IVT to find the address 
of the ISR associated with that interrupt. This lookup ensures that the CPU can quickly and efficiently 
jump to the right piece of code to handle the event. 


Its key features include the following: 


e Fixed location: The IVT is typically located at a fixed address in memory. For Arm Cortex-M 
microcontrollers, it starts at address 0X00000000. 


e ISR addresses: Each entry in the IVT contains the address of an ISR. When an interrupt occurs, 
the CPU uses the IVT to quickly locate and jump to the appropriate ISR. 


e Vector numbers: Each interrupt source is assigned a unique vector number, which corresponds 
to an entry in the IVT. For example, vector number 0 might be for a reset interrupt, vector 
number 1 for a non-maskable interrupt (NMI), and so on. 


e Configurable: In many systems, the IVT can be configured during system initialization to 
point to the ISRs defined in your firmware. 
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Comparative analysis—interrupt-driven solutions versus polling- 
based solutions 


But what happens when we don’t use interrupts? Let’s explore some real-world case studies to illustrate 
the difference between solutions using interrupts and those that rely on polling. 


Case study 1—button debouncing 


In embedded systems, handling user inputs such as button presses is a common task. However, 
the way these inputs are managed can significantly impact the efficiency and responsiveness of the 
system. One particular issue is dealing with the problem of “bouncing,” where a mechanical button 
generates multiple rapid signals due to its physical characteristics. This can lead to incorrect readings 
and erratic behavior if not properly managed. In this case study, we will explore two approaches to 
handling button debouncing: one without interrupts and the other using interrupts. 


We'll start with the approach that doesn’t use interrupts. 


Imagine you have a simple user interface with a button that, when pressed, toggles an LED. Without 
interrupts, the most straightforward approach is to continuously check (or “poll”) the button’s state in 
the main loop. This involves repeatedly reading the button’s input pin to see whether it has changed 
from high to low, indicating a press. The problem here is that mechanical buttons can generate spurious 
signals due to physical bouncing, leading to multiple detections of a single press. To handle this, you 
need to add a delay after detecting a press, effectively ignoring further signals for a short period. 


There are a couple of drawbacks: 


e Inefficiency: The CPU is constantly busy checking the button state, wasting valuable processing 
time that could be used for other tasks 


e Delayed response: Adding a delay to handle debouncing means the system might miss other 
important tasks while waiting for the button to stabilize 


Let’s look at the approach with interrupts. 


Using interrupts, you configure the button pin to generate an interrupt on a falling edge (when the 
button is pressed). The ISR is triggered immediately when the button is pressed, handling the debouncing 
logic. The main loop remains free to perform other tasks without the overhead of constantly polling 
the button. 


There are a couple of benefits: 


e Efficiency: The CPU can focus on other tasks and only respond to the button press when necessary 


e Immediate response: The ISR responds instantly to the button press, making the system 
more responsive 
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Case study 2—sensor data acquisition 


Another common task in embedded systems is acquiring data from sensors, especially in applications 
such as weather stations, where multiple sensors continuously monitor environmental conditions. The 
method used to handle sensor data acquisition can greatly affect the system’s complexity and efficiency. 
Lets compare two approaches: one without using interrupts and the other leveraging interrupts to 
optimize the process. 


We'll start with the approach that doesn’t use interrupts. 


Consider a weather station that reads data from various sensors such as temperature, humidity, and 
pressure at regular intervals. Without interrupts, the main loop would include code to periodically read 
data from each sensor. This could be done using timers to create delays between readings, ensuring 
data is acquired at the right intervals. 


There are a couple of drawbacks: 


e Complexity: Managing multiple sensors with precise timing using polling can lead to a 
complicated main loop 


¢ Inefficiency: The main loop might spend a lot of time waiting for timers to expire—again, 
wasting CPU resources 


Let’s look at the approach with interrupts. 


Using interrupts, each sensor can trigger an interrupt when new data is available. The ISR for each 
sensor reads the data and stores it in a buffer for the main loop to process later. This approach decouples 
data acquisition from the main loop, allowing it to focus on data processing and other tasks. 


There are a couple of benefits: 


e Simplified code: The main loop is cleaner and easier to manage, as it doesn't need to handle 
timing and sensor polling directly. 


e Resource efficiency: The CPU spends less time waiting and more time processing, leading to 
more efficient use of resources. 


Case study 3—communication protocols 


Now, let’s see how interrupts can improve communication. Effective communication between a 
microcontroller and external devices is crucial for many embedded systems, whether youre dealing 
with sensors, displays, or other peripherals. The approach you take to manage data transmission and 
reception can have a significant impact on your system's performance, particularly in terms of CPU load 
and latency. Let’s analyze two methods for handling communication over Universal Asynchronous 
Receiver/Transmitter (UART): one without using interrupts and the other leveraging interrupts to 
optimize the process. 
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We'll start with the approach that doesn’t use interrupts. 


Let’s look at a scenario where a microcontroller communicates with another device over UART. 
Without interrupts, the firmware would continuously check the UART status register to see whether 
new data has arrived or whether the transmitter is ready to send data. This polling approach ensures 
no data is missed, but it can be very CPU-intensive. 


There are a couple of drawbacks: 


e Resource efficiency: Continuous polling keeps the CPU busy, leaving less processing power 
for other tasks 


e Latency: The time between data arrival and processing depends on how frequently the UART 
status is checked 


Let’s look at the approach with interrupts. 


Enabling UART interrupts allows the microcontroller to handle data reception and transmission 
events automatically. When new data arrives, an interrupt is triggered, and the ISR reads the data 
and processes it. Similarly, when the transmitter is ready, another interrupt can handle sending data. 


There are a couple of benefits: 


e Low CPU load: The CPU can perform other tasks and only deal with UART events when necessary 


e Real-time handling: Data is processed immediately upon arrival, reducing latency and improving 
communication efficiency 


Interrupts provide a powerful and efficient way to handle real-time events in embedded systems. 
Compared to polling methods, interrupts offer significant benefits in terms of responsiveness, 
efficiency, and code simplicity. By understanding and leveraging interrupts, you can develop more 
robust and efficient firmware, capable of handling a wide range of real-time applications. Whether 
its managing user inputs, acquiring sensor data, handling communication protocols, or maintaining 
accurate timekeeping, interrupts are an indispensable tool. 


In the next section, we shall explore the STM32 EXTI controller peripheral. 


The STM32 EXTI controller 


The EXTI module in STM32 microcontrollers is designed to manage external interrupt lines. These 
lines can be triggered by signals on GPIO pins, enabling your microcontroller to react to changes in 
the external environment quickly and efficiently. 


The EXTI controller is equipped with a range of features that enhance its flexibility and utility in 
embedded systems. Let’s explore these features in detail and understand their practical applications. 


The STM32 EXTI controller 


Key features of the EXTI 


Here are the key features of the EXTI module that make it a versatile and powerful tool for managing 
external and internal events in STM32 microcontrollers: 


e Provides up to 23 independent interrupt/event lines, with up to 16 from GPIO pins, and the 
rest from internal signals 

e Each line can be independently configured as either an interrupt or an event 

e Edge detection options for each line: rising, falling, or both edges 

e Dedicated status flags for each line to indicate pending interrupts/events 

e Ability to generate software interrupts/events 


An event registers that something happened by setting a corresponding status flag bit, but does not 
trigger an interrupt or execute any code (ISR). 


For example, events can be used to wake the system from sleep mode without executing an ISR. 


To use the EXTI for generating interrupts, we need to configure and enable the interrupt line properly. 
This involves programming the trigger registers to detect the desired edge (rising, falling, or both) 
and enabling the IRQ by setting the appropriate bit in the interrupt mask register. When the specified 
edge is detected on the external interrupt line, an IRQ is generated, and the corresponding pending 
bit is set. This pending bit must be cleared by writing a 1 to it in the pending register. 


To generate events, we simply configure the event line by setting the appropriate trigger registers and 
enabling the corresponding bit in the event mask register. 


We can also generate interrupts/events by software by writing a 1 to the software interrupt/event 
register (EXTI_SWIER). 


Configuring the lines as interrupt sources involves three steps: 
1. Configure mask bits: Set the mask bits for the 23 interrupt lines using the Interrupt Mask 
Register (EXTI_IMR). 


2. Configure trigger selection bits: Use the Rising Trigger Selection Register (EXTI_RTSR) 
and Falling Trigger Selection Register (EXTI_FTSR) to set the desired trigger conditions 
for the interrupt lines. 


3. Enable the NVIC IRQ channel: Over here, we simply configure the enable and mask bits that 
control the NVIC IRQ channel mapped to the EXTI. 


Before we develop the EXTI driver, let's first understand how the EXTI lines are mapped to the GPIO pins. 
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External interrupt/event line mapping 


The EXTI controller can connect up to 81 GPIOs (in the STM32F411xC/E series) to the 16 external 
interrupt/event lines. The GPIO pins are mapped to the EXT] lines through the SYSCFG_EXTICR registers. 


GPIO pins and EXTI lines 


Each GPIO pin on the STM32 microcontroller can be connected to an EXT] line, allowing it to generate 
external interrupts. This flexibility means you can enable interrupts for any GPIO pin, but theres a 
catch: multiple pins share the same EXTI line. This sharing is based on the pin number, not the port. 


Here’s a breakdown of how the pins are connected: 


e Pin 0 of every port is connected to EXTIO_IRQ 
e Pin 1 of every port is connected to EXTI1_IRQ 
e Pin 2 of every port is connected to EXTI2_IRQ 
e Pin 3 of every port is connected to EXTI3_IRQ 


e Andso on... 


This mapping means that all pins with the same number across different ports share the same EXTI 
line. For example, PAO, PBO, PCO, PDO, PEO, and PHO are all connected to EXTIO. 


Important note 


Shared EXTI lines: Since multiple pins share the same EXT] line, you cannot enable interrupts 
on two pins with the same number across different ports simultaneously. For instance, if you 
enable an interrupt on PBO, you cannot also enable an interrupt on PAO because both pins 
share EXTIO. 


Configuration in SYSCFG_EXTICR: The SYSCFG external interrupt configuration registers 
(EXTICRs) are used to select which port’s pin will be connected to a particular EXTI line. 
This selection ensures that only one pin from one port can be the source for a given EXTI line. 


Developing the EXTI driver 


The STM32 EXTI module relies on several key registers to configure its operation. These registers 
allow you to set up trigger conditions, enable interrupts, and manage pending interrupt requests. 
Understanding these registers is crucial for effectively using the EXTI module in your embedded projects. 
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EXTI_IMR 


We use the EXTI_IMR to enable or disable interrupts on each EXTI line. 


The bits in this register are named MRx. They are mask bits for each EXTI line (x = 0 to 22). Setting 
a bit to 1 unmasks the interrupt line, allowing it to generate an interrupt request. Conversely, setting 
it to 0 masks the line, preventing it from generating interrupts. 


EXTI_RTSR 


The EXTI_RTSR configures the rising edge trigger for each EXTI line. When a rising edge is detected 
on a line configured in this register, it can generate an interrupt or an event. 


The bits in this register are named TRx. They are trigger selection bits for each EXTI line (x = 0 to 
22). Setting a bit to 1 configures the line to trigger on a rising edge. 


EXTI_FTSR 


The EXTI_FTSR is used to configure the falling edge trigger for each EXTI line. When a falling edge 
is detected on a line set in this register, it can generate an interrupt or an event. 


The bits in this register are named TRx. They are trigger selection bits for each EXTI line (x = 0 to 
22). Setting a bit to 1 configures the line to trigger on a falling edge. 
Pending Register (EXTI_PR) 


The EXTI_PR indicates which EXTI lines have pending interrupt requests. This register is also used 
to clear pending interrupts by writing a 1 to the appropriate bit. 


The bits in this register are named PRx. They are pending bits for each EXTI line (x = 0 to 22). A bit 
set to 1 indicates a pending interrupt request. Writing 1 to the bit clears the pending request. 


Let’s configure PC13 as an EXTI pin. 


The EXTI driver 


Create a copy of your previous project in your IDE and rename this copied project EXTI. Next, create 
a new file named gpio_exti.cinthe Src folder and another file named gpio_exti.h in the 
Inc folder. 


Populate gpio_exti.c with the following code: 


#include "gpio exti.h" 


#define GPIOCEN (lU<<2)) 
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#define SYSCFGEN (1U<<14) 


void pcl3 exti_init (void) 

{ 
/*Disable global interrupts*/ 
__Gliseiolls ay 


/*Enable clock access for GPIOC*/ 
RCC- >AHB1ENR | =GPIOCEN; 


/*Set PC13 as input*/ 
GPIOC->MODER &=~(1U<<26) ; 
GPIOC-sMODER &=~(1U<<27) ; 


/*Enable clock access to SYSCFG*/ 
RCC- >APB2ENR | =SYSCFGEN; 


/*Select PORTC for EXTI13*/ 
SYSCFG- >EXTICR [3] |=(GlitieeS)) p 


/*Unmask EXTI13*/ 
EXTI->IMR |=(1U<<13); 


/*Select falling edge trigger*/ 
EXTI->FTSR |=(1U<<13); 


/*Enable EXTI13 line in NVIC*/ 
NVIC_EnableIRQ(EXTI15_ 10 IRQn) ; 


/*Enable global interrupts*/ 
— emolle Acc) y 


} 


Let’s break down each step within the pc13_exti_init function: 
__Glaljsyellohey_alseei(()) 5 


This line disables global interrupts to ensure that the configuration process is not interrupted, which 
is crucial for maintaining consistency and avoiding race conditions. 


RCC->AHB1ENR |= GPIOCEN; 
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This line enables the clock for GPIOC by setting the appropriate bit in the AHB1 peripheral clock 
enable register (AHB1ENR). 


GPIOC->MODER &= ~(1U<<26) ; 
GPIOC->MODER &= ~(1U<<27); 


These lines configure pin PC13 as an input by clearing the appropriate bits in the GPIO mode 
register (MODER). 


RCC- >APB2ENR le SYSCFGI 


|e] 


N; 


This line enables the clock for SYSCFG by setting the appropriate bit in the APB2 peripheral clock 
enable register (APB2ENR). SYSCFG is required for configuring the EXTI lines to map to the 
appropriate GPIO pins. 


SYSCFG->EXTICR[3] |= (1U<<5); 


This line configures the SYSCFG external interrupt configuration register to map EXTI line 13 to 
PORTC. The EXTICR [3] register controls EXTI lines 12 to 15, and setting the correct bits ensures 
that EXTI13 is connected to PC13. 


EXTI->IMR |= (1U<<13); 


This line unmasks EXTI line 13 by setting the appropriate bit in the interrupt mask register (IMR). 
Unmasking the line allows it to generate interrupt requests. 


EXTI->FTSR |= (1U<<13) ; 


This line sets EXTI line 13 to trigger on a falling edge by setting the appropriate bit in the FTSR. This 
configuration is essential for detecting when the signal transitions from high to low. 


NVIC_EnableIRQ(EXTI15_ 10 IRQn); 


This line enables the EXTI15_ 10 interrupt line in the NVIC. EXT] lines 10 to 15 share an IRQ in 
the NVIC, and enabling it allows the microcontroller to handle interrupts from these lines. 


__ texas ize) 5 


This line re-enables global interrupts after the configuration is complete, allowing the microcontroller 
to respond to interrupts. 


Our next task is to populate the gpio_exti .h file. 
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Here is the code: 


#ifndef GPIO EXTI H 
#define GPIO EXTI H _ 


#include <stdint.h> 
#include "stm32f4xx.h" 


#define LINE13 (1U<<13) 
void pcl13 exti_ init (void); 
#endif 


Now, let’s test the driver in the main. c file: 


#include <stdio.h> 
#include "adc.h" 
#include "uart.h" 
#include "gpio.h" 
#include "gpio exti.h" 


tiach © e otn regar 


int main (void) 


{ 


/*Initialize debug UART*/ 
viehe asai (()) A 


/*Initialize LED*/ 
led init (); 


/*Initialize EXTI*/ 
DEIJ Giri daie (()) p 


while (1) 


{ 


} 


static void exti_callback (void) 


{ 
printf ("BTN Pressed...\n\r") ; 
lodmrogguiel@)s; 
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} 


void EXTI15 10 IRQHandler(void) { 
if ((EXTI->PR & LINE13) !=0) 


{ 
/*Clear PR flag*/ 
EXTI->PR |=LINE13; 
//Do something... 
exti_callback(); 

} 


} 
Let’s break down the code: 


e The main function initializes the system components and then enters an infinite loop. 
e The exti_callback function is called when the external interrupt occurs. 


e EXTI15 10 IRQHand1ler handles interrupts for EXTI lines 10 to 15, including line 13 (PC13) 


if ((EXTI->PR & LINE13) != 0) 


The preceding line checks whether the pending bit for EXTI line 13 is set, indicating an interrupt 
has occurred. 


EXTI->PR |= LINE13; 


The preceding line clears the pending bit by writing a 1 to it, acknowledging the interrupt, and allowing 
it to be processed again. 


exti_ callback () ; 


This calls the ext i_callback function to handle the interrupt, which, in our case, prints a message 
and toggles the LED. 


To test on the microcontroller, simply build the project and run it. To generate the EXTI interrupt, 
press the blue push button. 


Open RealTerm and configure the appropriate port and baud rate to view the printed message that 
confirms the EXTI interrupts occur when the blue push button is pressed. 
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Summary 


In this chapter, we learned about the important role of interrupts in embedded systems development. 
Interrupts are essential for creating responsive and efficient firmware, allowing microcontrollers to handle 
real-time events effectively. By mastering the concepts of interrupts, you can develop systems that react 
promptly to external stimuli, enhancing the robustness and versatility of your embedded applications. 


We started by exploring the fundamental role of interrupts in firmware, comparing them with 
exceptions to highlight their unique purposes and handling mechanisms. We then examined the 
specifics of the ISR, the IVT, and the NVIC, which together form the backbone of interrupt handling 
in Arm Cortex-M microcontrollers. 


Next, we focused on the STM32 EXTI controller, a vital peripheral for managing external interrupts 
in STM32 microcontrollers. We discussed the key features and registers of the EXTI, and how to 
configure and utilize it for various applications. 


Finally, we applied this knowledge by developing an EXTI driver, providing practical experience in 
implementing interrupt-driven firmware. 


In the next chapter, we will learn about the Realtime Clock (RTC) peripheral. 


15 
The Real-Time Clock (RTC) 


In this chapter, we will explore the Real-Time Clock (RTC) peripheral, an essential component for 
timekeeping in embedded systems. This peripheral is crucial for applications that require accurate 
time and date maintenance, making it fundamental for a wide range of embedded applications. 


We will start by introducing RTCs and understanding how they function. Following this, we will delve 
into the STM32 RTC module, examining its features and capabilities. Next, we will analyze the relevant 
registers from the STM32 reference manual, providing a detailed understanding of the configuration 
and operation of the RTC. Finally, we will apply this knowledge to develop an RTC driver, enabling 
precise timekeeping in your embedded projects. 


In this chapter, we will cover the following main topics: 
e Understanding RTCs 
e The STM32 RTC module 
e Some key RTC registers 
e Developing the RTC driver 


By the end of this chapter, you will have a solid understanding of how RTCs work and will be equipped 
with the skills to develop bare-metal RTC drivers, allowing you to implement accurate timekeeping 
in your embedded systems projects. 


Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


Understanding RTCs 


In this section, we will enter the world of RTCs, understanding what they are and how they work 
before exploring common use cases through a few interesting case studies. Let’s get started! 
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RTCs are specialized hardware devices found in many microcontrollers and embedded systems. Their 
primary function is to keep track of the current time and date, even when the main power supply is 
turned off. Imagine them as the little timekeepers of the digital world, ensuring that the clock never 
stops ticking, no matter what. 


RTCs are crucial in applications where timekeeping is essential. This includes everything from simple 
alarm clocks to complex data logging systems, where accurate timestamps are necessary. An RTC 
continues to operate on a small battery when the main system is powered down, maintaining accurate 
time and date information. Let’s see how they work. 


How do RTCs work? 


At the heart of an RTC is a crystal oscillator, which provides a stable clock signal. This oscillator 
typically runs at 32.768 kHz, a frequency chosen because it is easily divisible by powers of two, making 
it convenient for binary counting. 


Here’s a simplified breakdown of how an RTC works: 


e Crystal oscillator: The RTC contains a crystal oscillator that generates a precise clock signal. 


e Counter: This clock signal drives a counter. The counter increments at a rate determined by 
the oscillator’s frequency. 


¢ Time and date registers: The counter’s value is used to update time and date registers, which 
hold the current time (hours, minutes, seconds) and date (day, month, year). 


e Battery backup: To ensure continuous operation, RTCs often have a battery backup. This keeps 
the oscillator running and the counter active even when the main power is off. 


Let’s consider some common use cases for RTCs. 


Common use cases for RTCs 


RTCs are incredibly versatile and are used in a wide variety of applications. Let's explore some of the 
common use cases through a few case studies. 


Case study 1 - data logging 


One of the most common applications of RTCs is in data logging. Imagine that you're designing a 
weather station that collects temperature, humidity, and pressure data. Accurate timestamps are crucial 
for analyzing trends and patterns over time. Here’s how an RTC plays a vital role in this scenario: 


e Initialization: The RTC is initialized and set to the current time and date 


e Data collection: Every time a sensor reading is taken, the RTC provides a timestamp 


Understanding RTCs 


e Storage: The sensor data, along with the timestamp, is stored in memory 


e Analysis: When the data is retrieved for analysis, the timestamps ensure that each reading can 
be accurately placed on a timeline 


In this case, the RTC ensures that every piece of data is accurately timestamped, making it possible 
to track changes and trends with precision. 


Case study 2 - alarm clocks 


RTCs are also fundamental in designing alarm clocks. Be it a simple bedside alarm clock or a complex 
scheduling system, the RTC provides the accurate timekeeping needed to trigger events at the right 
moment. Let's look at a typical alarm clock scenario: 


e Timekeeping: The RTC keeps track of the current time, continuously updating the time registers. 


e Alarm setting: The user sets an alarm for a specific time. This information is stored in the 
RTC alarm registers. 


e Alarm trigger: When the RTC time matches the alarm time, an interrupt is triggered, activating 
the alarm mechanism (such as sounding a buzzer or turning on a light). 


In this case, the RTC ensures that the alarm goes off at the precise time set by the user, making it an 
essential component for reliable time-based alerts. At this point, your next question might be, “What's 
so special about RTCs?” 


Why are RTCs important? 


You might be wondering why we can't just use the system clock or general-purpose timers for 
timekeeping. The answer lies in the RTC’s ability to keep accurate time, even when the main system 
is powered down. Here are some key reasons why RTCs are indispensable: 


e Accuracy: RTCs use crystal oscillators, which provide highly accurate timekeeping 


e Low power consumption: RTCs are designed to operate on very low power, often running 
for years on a small battery 


e Battery backup: RTCs continue to keep time even when the main power is off, thanks to their 
battery backup 


e Independence from the main system: RTCs operate independently of the main microcontroller, 
ensuring continuous timekeeping 


By understanding how RTCs work and their common use cases, we can appreciate their importance 
and effectively incorporate them into our embedded projects. Whether you're building a simple 
alarm clock or a complex data logging system, the RTC is an important component that ensures your 
system always knows the right time. In the next section, we will explore the RTC peripheral in our 
STM32F411 microcontroller. 
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The STM32 RTC module 


In this section, we will explore the RTC module in the STM32F4 microcontroller family. Let’s start 
by looking at its features. 


The main features of the STM32F4 RTC module 


The STM32F4 RTC module is like the Swiss Army knife of timekeeping, offering a rich set of features 
designed to meet the needs of numerous applications. Here are some of the standout features: 


e Calendar with sub-seconds: The RTC module doesn't just keep track of hours, minutes, and 
seconds; it also maintains sub-second accuracy. This is particularly useful for applications that 
require precise time measurements. 


e Alarm functionality: Imagine that you have two alarm clocks within your microcontroller. The 
STM32F4 RTC module provides two programmable alarms, Alarm A and Alarm B, which can 
trigger events at specific times. This is perfect for tasks that need to be performed at regular 
intervals or at a specific time of day. 


e Low power consumption: One of the biggest advantages of the RTC module is its low power 
usage. This makes it ideal for battery-operated devices, where conserving power is paramount. 


e Backup domain: The RTC can operate independently of the main power supply thanks to 
a backup battery. This means that even if your device loses power, the RTC keeps running, 
maintaining accurate time. 


e Daylight saving time: With the RTC module, you can program adjustments for daylight saving 
time automatically. No more manual resets twice a year! 


e Automatic wakeup: The RTC can generate periodic wakeup signals, bringing your system out 
of low-power modes at preset intervals. This feature is invaluable for applications that need to 
perform regular checks or updates. 


e Tamper detection: Security is a critical aspect of many applications, and the RTC module 
has you covered with tamper detection. It can log tamper events, providing an added layer of 
security for your system. 


e Digital calibration: Accuracy is king when it comes to timekeeping. The RTC module includes 
a digital calibration feature to compensate for deviations in the crystal oscillator frequency, 
ensuring your timekeeping remains spot-on. 


e Synchronization with external clocks: To enhance precision, the RTC can synchronize with 
an external clock source. This is great for applications that need to maintain very high accuracy 
over long periods. 


Now, let’s analyze some of the key components of the STM32F4 RTC module. 
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The key components of the STM32F4 RTC module 


Let’s take a closer look at the key components of the RTC module in the STM32F4 microcontroller 
family. We’ll break down each part to understand how they work together to provide accurate 
timekeeping and versatile functionality, starting with the clock sources. 


Clock sources 


The driver of the RTC module is its clock sources. Figure 15.1 presents a detailed block diagram of the 
RTC module, highlighting the RTC clock sources. This diagram, sourced from the reference manual, 


provides a clear visual representation of the various components and their interactions within the 
RTC module: 
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Figure 15.1: RTC block diagram with clock sources highlighted 


The STM32F4 RTC can use multiple clock sources: 


e Low-speed external (LSE): A 32.768 kHz crystal oscillator known for its stability and low power 
consumption. This is typically the preferred clock source for accurate timekeeping. 


e Low-speed internal (LSI): An internal RC oscillator that provides a less accurate but convenient 
option when an external crystal is not available. 


e High-speed external (HSI): A high-speed clock source that can be used but is less common 
for RTC applications due to its higher power consumption. 
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The selected clock source feeds into the RTC’s prescalers, which are responsible for dividing the clock 
frequency into suitable levels for timekeeping: 
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Figure 15.2: RTC block diagram - asynchronous and synchronous prescalers 


Prescalers 
The RTC module employs two types of prescalers: 


e Asynchronous prescaler: This prescaler, typically set to divide by 128, reduces the clock 
frequency to a lower rate that can be managed by the synchronous prescaler. It helps balance 
power consumption and accuracy. 


e Synchronous prescaler: Often configured to divide by 256, this prescaler further reduces the 
clock frequency to generate a precise 1 Hz clock, which is essential for updating the time and 
date registers accurately. 


These prescalers ensure the RTC can operate efficiently, providing the necessary timekeeping precision 
while conserving power. Next, we have the time and date registers. 


The STM32 RTC module 


Time and date registers 


Figure 15.3 highlights the time and date registers of the RTC block: 
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Figure 15.3: RTC block diagram - time and date registers 


At the core of the RTC’s functionality are the time and date registers: 


e Time register (RTC_TR): This register holds the current time in hours, minutes, and seconds, 
stored in Binary-Coded Decimal (BCD) format. It is updated every second by the 1 Hz clock 
from the prescalers. 


e Date register (RTC_DR): This register maintains the current date, including the year, month, 
and day, also in BCD format. 


These registers are crucial for maintaining accurate time and date information, which can be read and 
adjusted as needed. The next key component is the RTC alarm. 
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Alarms 


The RTC module features two programmable alarms, Alarm A and Alarm B. These alarms can be set 
to trigger at specific times, providing a powerful tool for scheduling tasks: 


e Alarm registers: Each alarm has its own set of registers (RTC_ALRMAR and RTC_ALRMBR) 
to store the alarm time and date. 
e Interrupts: When an alarm is triggered, it can generate an interrupt, waking up the microcontroller 


from a low-power state or initiating a specific function. 


The alarm modules are indicated in the following figure: 
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Figure 15.4: RTC block diagram - alarms 


Next, we have the wakeup timer. 


Wakeup timer 


Another key feature of the RTC module is the wakeup timer, which is managed by the RTC_WUTR register: 
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Figure 15.5: RTC block diagram - wakeup timer 


This 16-bit auto-reload timer can generate periodic wakeup events, bringing the system out of 
low-power modes at regular intervals. It’s ideal for tasks such as sensor readings or system checks, 
ensuring efficient power usage. 


There's also the tamper detection module. Let’s take a look. 
Tamper detection 


Security is a vital aspect of many applications, and the RTC module includes tamper detection features. 
The tamper detection circuitry can log events when a tamper attempt is detected, using the timestamp 
registers to record the exact time and date. This adds an extra layer of security, especially in applications 
requiring reliable timekeeping and event logging. Next, we have the calibration register features. 


Calibration and synchronization 
To maintain high accuracy, the RTC module includes calibration features: 


e Digital calibration: The RTC_CALR register allows for fine adjustments to the clock frequency, 
compensating for any deviations in the crystal oscillator 


e External clock synchronization: The RTC can synchronize with an external clock source, enhancing 
accuracy by periodically adjusting the internal clock so that it matches the external reference 
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These features ensure the RTC maintains precise timekeeping, even in varying environmental conditions. 
We also have the backup and control registers module. 


Backup and control registers 
The RTC module includes several backup and control registers: 


e Backup registers: These registers store critical data that must be retained even when the main 
power supply is off 


e Control registers: These registers (such as RTC_CR) manage the configuration and operation 
of the RTC, including enabling the clock, setting alarms, and configuring wakeup events 


The backup and control registers module is indicated in the following figure: 
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Figure 15.6: RTC block diagram - backup and control registers 


Finally, there is the output control block. 


Output control 


The RTC module can output specific signals, such as a calibration clock or alarm outputs, through the 
RTC_AF1 pin. This allows the RTC module to interact with other components or systems, providing 
synchronized signals or triggering external events. 


In the next section, we will analyze some of the key registers for configuring the RTC peripheral. 


Some key RTC registers 


Some key RTC registers 


In this section, we will explore the characteristics and functions of some of the important registers 
within the RTC module. These registers are the building blocks that allow us to configure, control, 
and utilize the RTC’s features effectively. Let’s start with the RTC Time Register (RTC_ TR). 


RTC Time Register (RTC_TR) 


The RTC_TR register is responsible for keeping track of the current time. It maintains the hours, 
minutes, and seconds in BCD format, ensuring that time is easily readable and manipulable. Here 
are some of the key fields in this register: 


¢ Hour tens (HT) and hour units (HU): These bits represent the tens and units of the hour, 
respectively. They can handle both 24-hour and 12-hour formats. 


¢ Minute tens (MNT) and minute units (MNU): These bits represent the tens and units of 
the minutes. 


e Second tens (ST) and second units (SU): These bits represent the tens and units of the seconds. 


e PM: This bit indicates the AM/PM notation when in 12-hour format. 


Further information about this register can be found on page 450 of the reference manual. Let’s move 
on to the RTC Date Register (RTC_DR). 


RTC Date Register (RTC_DR) 


The RTC_DR register is responsible for maintaining the current date. It keeps track of the year, month, 
day of the month, and day of the week, all in BCD format. 


The following are the key fields in this register: 


e Year tens (YT) and year units YU): These bits represent the tens and units of the year 
e Month tens (MT) and month units (MU): These bits represent the tens and units of the month 


e Date tens (DT) and date units (DU): These bits represent the tens and units of the day of 
the month 


e Week day units (WDU): This bit represents the day of the week (1 to 7) 


You can read more about this register on page 451 of the reference manual. The next crucial register 
is the RTC Control Register (RTC_CR). 
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RTC Control Register (RTC_CR) 


The RTC_CR register is where we control the various operational modes and features of the RTC. This 
register allows us to enable the RTC, configure alarms, and set up the wakeup timer. 


Let’s consider the key bits in this register: 
e WUTE: Enable the wakeup timer. This bit enables the RTC wakeup timer. 
e TSE: Enable a timestamp event. This bit enables the timestamping of events. 
e ALRAEand ALRBE: Enable Alarm A and Alarm B. These bits enable the respective alarms. 
e DCE: Enable digital calibration. This bit enables digital calibration of the RTC clock. 


e FMT: Hour format. This bit sets the hour format to either 24-hour or 12-hour (AM/PM). 


Further details about this register can be found on page 453 of the reference manual. Next, we have 
the RTC Initialization and Status Register (RTC_ISR). 


RTC Initialization and Status Register (RTC_ISR) 


The RTC_ISR register plays a dual role in both initializing the RTC and monitoring its status. This 
register is crucial during the setup process and for checking the RTC’s current state. Here are the key 
bits in this register: 

e INIT: Initialization mode. Setting this bit puts RTC into initialization mode. 

e RSF: Registers synchronization flag. This bit indicates that the calendar registers are synchronized. 

e INITS: Initialization status flag. This bit indicates whether the RTC calendar has been initialized. 


e ALRAF and ALRBF: Alarm A and Alarm B flags. These bits indicate whether an alarm has 
been triggered. 


Next, we have the RTC Prescaler Register (RTC_PRER). 


RTC Prescaler Register (RTC_PRER) 


The RTC_PRER register manages the prescalers that divide the RTC clock source to produce the 1 
Hz clock necessary for accurate timekeeping. There are two key fields in this register: 


e PREDIV_A: Asynchronous prescaler. This field sets the value for the asynchronous prescaler. 
e PREDIV_S: Synchronous prescaler. This field sets the value for the synchronous prescaler. 
Configuring the RTC_PRER register properly is vital for maintaining the accuracy of the RTC. 


Next, we'll look at the RTC Alarm Registers, RTC_ALRMAR and RTC_ALRMBR. 
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RTC Alarm Registers (RTC_ALRMAR and RTC_ALRMBR) 


These registers handle the configuration of Alarms A and B. They allow us to set specific times when 
the alarms should trigger. Here are the key fields in these registers: 


e ALRMASK: Alarm mask bits. These bits allow you to mask certain parts of the alarm time, 
providing flexibility in how and when the alarms trigger. 
e ALRH, ALRMN, and ALRS: Hour, minute, and second fields. These fields set the specific 


time for the alarm. 


The RTC_ALRMAR and RTC_ALRMBR registers are vital for applications requiring reliable, time-based 
event triggering. Lastly, let’s explore the RTC Wakeup Timer Register (RTC_WUTR). 


RTC Wakeup Timer Register (RTC_WUTR) 


The RTC_WUTR register configures the wakeup timer, enabling the RTC to periodically wake the 
system from low-power modes. The key field in this register is the wakeup auto-reload value (WUT). 
This field sets the interval for the wakeup timer. 


In the next section, we will apply everything we've learned in this section to develop a driver for the 
RTC peripheral. 


Developing the RTC driver 


In this section, we will develop the RTC calendar driver so that we can configure and keep track of 
the time and date. 


As always, we will create a copy of our previous project while following the steps outlined in earlier 
chapters. We rename this copied project to RTC_Calendar. Next, create a new file named rtc.c 
in the Src folder and another file named rtc .h in the Inc folder. The RTC configuration can be 
quite elaborate, so we will create several helper functions to modularize the initialization process. 


The RTC implementation file 


Let’s begin by populating the rtc . c file, starting with the helper functions necessary for the initialization 
function. Here are the macro definitions that we will use in the RTC configuration: 


iuiMebyele Wretefe) jail! 

#define PWREN (1U << 28) 
#define CR_DBP (ly << B) 
#define CSR_LSION CU sa 0) 
#define CSR _LSIRDY CWU ee ak) 
#define BDCR_BDRST (alo ec iL) 
#define BDCR_RTCEN (alo ce 15) 
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#define RTC WRITE PROTECTION KEY 1 ((uint8_t) 0xCAU) 
#define RTC WRITE PROTECTION KEY 2 ((uint8_t) 0x53U) 


#define RTC_INIT_MASK OxXFFFEFFFFU 
#define ISR_INITF (1U sa G)) 
#define WEEKDAY FRIDAY ((uint8 t) 0x05U) 
#define MONTH DECEMBER ((uint8 t) 0x12U) 
#define TIME FORMAT PM (HU a= 22) 
#define CR_FMT (WU 3) 
#define ISR_RSF (W0 zz 5) 


#define RTC _ ASYNCH_PREDIV ((uint32_t) Ox7F) 
( 


#define RTC_SYNCH_PREDIV 


(uint32_t) 0x00F9) 


Let’s break them down: 


PWREN (1U << 28): This macro enables the clock for the PWR module by setting bit 28 in 
the APB1 peripheral clock enable register. 


CR_DBP(1U << 8): This enables access to the backup domain by setting bit 8 in the PWR register. 


CSR_LSION(1U << 0): This macro enables the LSI oscillator by setting bit 0 in the Clock 
Control & Status Register. 


CSR_LSIRDY(1U << 1): This macro is used to read the state of the LST register. The 
LSIRDY bit is set to 1 when the LSI is stable and ready to be used. 


BDCR_BDRST(1U << 16): This macro forces a reset of the backup domain by setting bit 
16 in the Backup Domain Control Register. 


BDCR_RTCEN(1U << 15): This enables RTC by setting bit 15 in the Backup Domain 
Control Register. 


RTC_WRITE PROTECTION KEY _1((uint8_t) 0xCAU): This key is used to disable write 
protection on the RTC registers. 


RTC_WRITE PROTECTION KEY 2((uint8_t) 0x53U): This is the second key needed 
to disable write protection on the RTC registers. 


RTC_INIT MASK (OXFFFFFFFFU): This mask is used to enter initialization mode in the 
RTC peripheral. 


ISR_INITF(1U << 6): This bit in the ISR register indicates that the RTC peripheral is in 
initialization mode. 


WEEKDAY FRIDAY ((uint8_t) 0x05U): This macro is used to configure the weekday of 
the calendar to Friday. 
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MONTH DECEMBER ((uint8_t) 0x12U): This macro is used to configure the month of 
the calendar to December. 


TIME FORMAT PM(1U << 22): This macro sets the time format to PM in the 12-hour format. 


CR_FMT (1U << 6): This macro sets the hour format to 24-hour format in the RTC control 
register (RTC- >CR). 
ISR_RSF(1U << _ 5): This bit in the ISR register indicates that the RTC registers 


are synchronized. 


RTC_ASYNCH_PREDIV ((uint32_t) 0x7F): This value sets the asynchronous prescaler 
for the RTC peripheral and is used to divide the clock frequency. 


RTC_SYNCH_PREDIV((uint32_t) 0x00F9): This value sets the synchronous prescaler for 
the RTC peripheral and is used to further divide the clock frequency for timekeeping accuracy. 


Let’s examine the two functions responsible for setting the prescaler values for the RTC peripheral. 


First, we have rtc_set_asynch_prescaler: 


static void rtc_set_asynch prescaler(uint32_t AsynchPrescaler) 


{ 


} 


MODIFY_REG(RTC->PRER, RTC_PRER_PREDIV_A, AsynchPrescaler << RTC_ 
PRER_PREDIV_A Pos) ; 


This function sets the asynchronous prescaler value for the RTC peripheral. Let’s break it down: 


MODIFY_REG: This macro modifies specific bits in a register. It is defined in the stm32f£4xx.h 
header file. 


RTC->PRER: The PRER register of the RTC. 
RTC_PRER_PREDIV_A: The mask for the asynchronous prescaler bits in the PRER register. 


AsynchPrescaler << RTC_PRER_PREDIV_A Pos: This snippet shifts the 
AsynchPrescaler value to the correct position within the PRER register. 


In short, this function configures the asynchronous prescaler by updating the appropriate bits in the 
PRER register. Next, we have the function for setting the synchronous prescaler - that is, rtc_set_ 
synch _prescaler: 


BiceNesC! Weel sae _SSie_Shyaavelal_jopataysyeeulens Vints 2 ic Syaaelashqesierlere)) 


{ 
} 


MODIFY REG(RTC->PRER, RTC _PRER_PREDIV_S, SynchPrescaler) ; 
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This function configures the synchronous prescaler by updating the appropriate bits in the PRER register: 


e RTC->PRER: The PRER register of the RTC 
e RTC _PRER_PREDIV_S: This is the mask for the synchronous prescaler bits in the PRER register 


e SynchPrescaler: This directly sets the synchronous prescaler value in the PRER register 


Next, we must analyze the other RTC initialization helper functions to understand how they work 
together to configure and synchronize the RTC peripheral. 


First, we have the rtc_enable_ init _mode function: 


void rtc enable init mode (void) 


{ 
} 


RTC->ISR = RTC_INIT MASK; 


Simply put, this function sets the RTC peripheral to initialization mode by writing the initialization 
mask to the ISR register: 


e RTC->ISR: This is the RTC_ISR register of the RTC 


e RTC_INIT_ MASK: This mask is used to enter initialization mode 
Next, we have _rtc_disable init mode: 


void _rtc_ disable init mode (void) 


| 
} 


RTC->ISR = ~RTC_INIT MASK; 


This function disables initialization mode by clearing the initialization mask in the ISR register. Here, 
~RTC_INIT_MASK clears the initialization mask, exiting initialization mode. 


The next helper function is rtc_isActiveflag init: 


uint8 t _rtc_isActiveflag_init (void) 


{ 
} 


ertan ((AIC=>1S9R Ge WS NRCS) S= SR NI) e 


This function returns 1 if the RTC peripheral is in initialization mode by checking the ISR_INITF 
bit, which indicates that the RTC peripheral is in initialization mode. 


Next, we have _rtc_isActiveflag_rs: 


uint8 t _rtc_isActiveflag_rs (void) 


{ 
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returni RTC E ESR SR ERS) ie —s el SRR SE) Ey; 


} 


This function returns 1 if the RTC registers are synchronized by checking the ISR_RSF bit, which 
indicates that the RTC registers are synchronized. 


There's also the rtc_init_seq function: 


Statics viime o cee dimli SECLE) 
/* Start init mode */ 
rtc enable init mode (); 


/* Wait till we are in init mode */ 
while (_rte_isActiveflag_init() != 1) {} 


return 1; 


} 


This function starts the RTC initialization by enabling initialization mode and waiting until the RTC 


peripheral enters initialization mode: 
e _rtc_enable_ init mode: This line puts the RTC peripheral into initialization mode 


e _xrtc_isActiveflag_init: This line waits until the RTC peripheral is in initialization mode 


Next, we have the wait_for_ synchro function: 


static uint8 t wait _for_synchro (void) 


{ 


/* Clear RSF */ 
RTC->ISR &= ~ISR_RSF; 


/* Wait for registers to synchronize */ 
while (_rte_isActiveflag_rs() != 1) {} 


return 1; 


} 


This function clears the synchronization flag and waits until the RTC registers are synchronized. 
We also have the exit_init seq function: 


gracie vinice i eeit tonie SEC Voie) 


{ 


/* Stop init mode */ 
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_rtc_disable init _mode() ; 


/* Wait for registers to synchronize */ 
return (wait for _synchro())j; 


} 


This function exits the RTC initialization mode and waits for the registers to synchronize to ensure 
everything is set up correctly. Now, let’s see the functions for configuring the date and time. 


First, we have rtc_date_config: 


pratile vorc gce Cete Gomt ie (balsics 7A ic WESIDEN/, UbinesA e Wey, Walahes\a_ic 
Month, uint32 t Year) 


{ 


register wintks2 tt temp = OU; 


temp = (WeekDay << RTC_DR_WDU Pos) |\ 

(Year & OxFOU) << (RTC_DR_YT Pos - 4U)) | 
Year & Ox0FU) << RTC DR YU Pos)) IN 

(Month & OxFOU) << (RTC_DR_MT Pos - 4U)) | 
Month & Ox0FU) << RTC_DR MU Pos)) |\ 

(Day & OxFOU) << (RTC_DR_DT Pos - 4U)) | 
Day & OxOFU) << RTC _DR_DU Pos)); 


( 
( 
( 
( 
( 
( 


MODIFY REG(RTC->DR, (RTC_DR_WDU | RTC_DR_ MT | RTC_DR_MU | RTC_DR DT 
| RIC wR Dy | RIC DR ea’ | aS wR wu), Cema) y 


This function sets the date in the RTC peripheral. 


We begin by creating a temporary variable called temp to hold the date value. This variable is carefully 
constructed by shifting and combining the weekday, day, month, and year into the appropriate positions 
for the RTC’s date register. The MODIFY _REG macro is then used to update the RTC’s date register 
with this new value. Here, we can have the following values: 


e WeekDay: Represents the day of the week 
e Day: Represents the day of the month 

e Month: Represents the month of the year 
e Year: Represents the year 


In essence, rtc_date_config takes the date components, assembles them into a single value, and 
writes it to the RTC’s date register, ensuring the RTC peripheral accurately tracks the current date. 
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We have the rtc_time_config function for configuring the time: 


SCENE! WAoaLC) See) eas! con se; (bakinjes A ie IstepgielelA 4), Wulisles\ ic isteybhats), 
uint32_ t Minutes, uint32_t Seconds) 


{ 


register uint32 t temp = OU; 


temp = Format12_24 |\ 

((Hours & OxFOU) << (RTC_TR_HT Pos - 4U)) | 
(Hours & OxOFU) << RTC_TR_HU Pos)) |\ 

( (Minutes & OxFOU) << (RTC_TR_MNT Pos - 4U)) | 
(Minutes & OxOFU) << RTC_TR_MNU_Pos)) |\ 
((Seconds & OxFOU) << (RTC_TR_ST Pos - 4U)) | 
( 


( 
( 
( 
( 
( 
((Seconds & Ox0FU) << RTC_TR_SU_Pos)) ; 


MODIFY REG(RTC->TR, (RTC_TR_PM | RTC_TR_HT | RTC_TR_HU | RTC_TR MNT 
| REC TR MNO | RIC wr Sw | KIC mr S0); tees) 5 


} 


This function sets the time in the RTC peripheral. Much like the date configuration, rtc_time_config 
begins by initializing a temporary variable, temp, to hold the time value. The time components - 
format, hours, minutes, and seconds - are then combined into this variable. The MODIFY REG macro 
updates the RTC’s time register with the newly constructed time value. Here, we have the following 


additional values: 
e Format12_24: This determines whether the time is in 12-hour or 24-hour format 
¢ Hours: Represents the hour value 
e Minutes: Represents the minute value 


e Seconds: Represents the second value 


Now that weve implemented all the helper functions required for initialization, lets go ahead and 
implement the rtc_init () function: 


void rtc_init (void) 
/* Enable clock access to PWR */ 
RCC->APB1ENR | = PWREN; 
/* Enable Backup access to config RTC */ 
PWR->CR |= CR_DBP; 


/* Enable Low Speed Internal (LSI) */ 
RCC->CSR |= CSR_LSION; 
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/* Wait for LSI to be ready */ 
while((RCC->CSR & CSR_LSIRDY) != CSR_LSIRDY) {} 


/* Force backup domain reset */ 
RCC->BDCR |= BDCR_BDRST; 


/* Release backup domain reset */ 
RCC->BDCR &= ~BDCR_BDRST; 


/* Set RTC clock source to LSI */ 
RCC->BDCR &= ~(1U << 8); 
RCC->BDCR |= (1U << 9); 


/* Enable the RTC */ 
RCC->BDCR |= BDCR_RTCEN; 


/* Disable RTC registers write protection */ 
RTC->WPR = RTC_WRITE PROTECTION KEY 1; 
RTC->WPR = RTC WRITE PROTECTION KEY 2; 


/* Enter the initialization mode */ 
dae ece ae igen) le al) 


{ 


// Handle initialization failure 


/* Set desired date: Friday, December 29th, 2016 */ 
rtc_date_config(WEEKDAY FRIDAY, 0x29, MONTH DECEMBER, 0x16); 


/* Set desired time: 11:59:55 PM */ 
rtc_time_config(TIME_FORMAT PM, 0x11, 0x59, 0x55); 


/* Set hour format */ 
RTC->CR |= CR_FMT; 


/* Set Asynchronous prescaler */ 
rtc_set_asynch_prescaler(RTC_ASYNCH_PREDIV) ; 


/* Set Synchronous prescaler */ 
HECESCE Synch prescallen (RIGS YNECHS PRED); 


/* Exit the initialization mode */ 


exit_init_seq(); 
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/* Enable RTC registers write protection */ 
RTC->WPR = OXFF; 


} 


Let’s break it down: 


RCC- >APB1ENR |= PWREN; 


We start by enabling the clock for the PWR module. This is crucial as it allows us to access and 
configure the RTC peripheral: 


PWR->CR |= CR_DBP; 


Next, we must enable access to the backup domain. This step is necessary to make changes to the 
RTC configuration: 


RCC->CSR |= CSR_LSION; 


Then, we enable the LSI oscillator, which serves as the clock source for the RTC peripheral: 


while ((RCC->CSR & CSR_LSIRDY) != CSR_LSIRDY) {} 


We must wait until the LSI oscillator is stable and ready to use: 


RCC->BDCR |= BDCR_BDRST; 


To ensure a clean configuration, we must force a reset of the backup domain: 


RCC->BDCR &= ~BDCR_BDRST; 


Then, we must release the reset, allowing the backup domain to function normally: 


RCC->BDCR &= ~(1U << 8); 
RCC->BDCR |= (1U << 9); 


After, we must configure the RTC peripheral so that it uses the LSI oscillator as its clock source: 


RCC->BDCR |= BDCR_RTCEN; 


Then, we must enable the RTC peripheral by setting the appropriate bit in the backup domain 
control register: 


RTC->WPR = RTC_WRITE PROTECTION KEY 1; 
RTC->WPR = RTC WRITE PROTECTION KEY 2; 
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To allow changes to the RTC registers, we must disable write protection: 


abie (age! baie eec) I= 1) 


{ 
} 


// Handle initialization failure 


We can enter initialization mode using our rtc_init_seq() helper function. If it fails, we must 
handle the error appropriately: 


rtc_date_config(WEEKDAY FRIDAY, 0x29, MONTH DECEMBER, 0x16) ; 


At this point, we must configure the RTC peripheral to the desired date using another one of our 
helper functions. In this example, we will set the date to Friday, December 29, 2016: 


rtc_time_config(TIME_FORMAT_PM, 0x11, 0x59, 0x55); 


T 


Then, we must set the RTC peripheral to the desired time. In this case, we will set the time to 11:59:55 PM.: 
RTC->CR |= CR_FMT; 
Next, we must configure the RTC peripheral so that it uses a 24-hour format: 


rtc_set_asynch_prescaler(RTC_ASYNCH_ PREDIV) ; 
rtc_set_synch prescaler(RTC_SYNCH_PREDIV) ; 


Then, we must set the asynchronous and synchronous prescaler values: 
eale mlie SEEN) y 


At this point, we can exit initialization mode using the exit_init_seq () helper function we 
created earlier: 


RTC->WPR = OXFF; 


Finally, we must re-enable write protection on the RTC registers to prevent accidental changes. 


This rtc_init function meticulously sets up the RTC peripheral by enabling the necessary clock, 
configuring the backup domain, setting the RTC clock source, and initializing the date and time. 


Before moving on to the main.c file, lets implement a few helper functions that are essential for 
handling various RTC tasks, such as converting values and getting the current date and time. 


Let’s start with the rtc_convert_dec2bcd function: 


Wine Sete secmCconvertadee2bed (uimt om e vewe) 


{ 
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return (uint8 t) ((((value) / 10U) << 4U) | ((value) % 10U)); 


} 


This function takes a decimal value and returns its equivalent in BCD format, which is useful for 
setting RTC values. 


Let’s take a closer look at decimal to BCD conversion: 
e First, ( (value) / 10U) << 4U shifts the tens digit to the left by 4 bits 
e Then, ((value) % 10U) gets the units digit 


e ‘The OR operation combines these two to form the BCD value 


Before going any further, let’s take a closer look at the BCD format and the conversion process. 


Understanding BCD format 


BCD is a way of representing decimal numbers in binary form. But here’s the twist: instead of 
converting the whole number into a single binary value, each decimal digit is represented by its own 
binary sequence. 


How does BCD work? 


In BCD, each digit of a decimal number is encoded separately as a 4-bit binary number. Let’s break it 
down with an example to make it clearer. 


Say you have the decimal number 42. In BCD, this would be represented as follows: 


e 4in decimal is 0100 in binary 


e 2in decimal is 0010 in binary 


So, 42 in BCD is 0100 0010. Notice how each decimal digit is converted into a 4-bit binary form and 
then combined to represent the entire number. 


Why use BCD? 


You might be wondering, why not just use regular binary? Well, BCD has its perks, especially in digital 
systems that need to display numbers or interface with human-readable formats: 


e Ease of conversion: Converting between BCD and decimal is straightforward. Each 4-bit group 
corresponds directly to a decimal digit, making it easy to read and convert. 


e Display compatibility: Devices such as digital clocks, calculators, and other numeric displays 
often use BCD because it simplifies the process of converting binary values into a form that 
can be easily shown on a screen. 
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Now, let’s see how this relates to RTCs. 
BCD in RTC configurations 


When working with RTCs, BCD is particularly handy. The RTC hardware often uses BCD to store 
time and date values because it simplifies how these values can be displayed and manipulated. For 
instance, setting the time to 12:34:56 in BCD means we have the following representations: 


e 12is 0001 0010 
e 34 is 0011 0100 
e 56 is 0101 0110 


Each of these pairs is easy to interpret and convert back into decimal for display or further processing. 


BCD format is a clever way of encoding decimal numbers in a binary system. By handling each decimal 
digit separately, BCD simplifies many operations, especially when interfacing with human-readable 
displays or systems that require precise decimal representation. Figure 15.6 illustrates how BCD values 
can easily be mapped onto digital displays: 


- Ee 
~ HEE 


10:32:17 


Figure 15.7: Display with BCD format 
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Now, lets analyze the rtc_convert_bcd2dec function: 


LLMs) ic wiES_Clofhience_loretlAclee: (Cime te, ves) 


{ 


return (uint8 t) (((uint8 t) ((value) & (uint8 _t)OxFOU) >> 
(uint8 t)O0x4U) * 10U + ((value) & (uint8_t)0Ox0FU)); 


} 


This function takes a BCD value and returns its decimal equivalent, making it easier to work with 
RTC data in a decimal format. 


Here’s the BCD to decimal conversion for this: 
e First, ( (value) & (uint8 t)OxFOU) >> (uint8 t) 0x4U extracts the tens digit 
e Then, ( (value) & (uint8 t)0x0OFU) extracts the units digit 
e Multiplication and addition combine these to form the binary value 

Next, we have a helper function for getting the day - that is, rtc_date_get_day: 


uint32 t rtc date get day (void) 


{ 


return (uint32_t) ((READ_BIT(RTC->DR, (RTC_DR_DT | RTC_DR_DU))) >> 
RTC_DR_DU_ Pos) ; 


} 


This function reads the RTC date register and returns the current day of the month. 


We can read the day by using READ_BIT(RTC->DR, (RTC_DR_DT | RTC_DR_DU)),which 
reads the day tens and units bits. 


Shifting the result to the right positions the value correctly. 
We also have a function for the year - that is, rt-c_date_get_year: 


uint32 t rtc date get year (void) 


{ 


return (uint32_t) ((READ_BIT(RTC->DR, (RTC_DR_YT | RTC_DR_YU))) >> 
RTC_DR_YU_Pos) ; 


} 


This function reads the RTC date register and returns the current year. 


Here, READ BIT(RTC->DR, (RTC_DR_YT | RTC_DR_YU) ) reads the year tens and units bits. 
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We also have functions for retrieving the month, second, minute, and hour, all of which are implemented 
using the same approach as the other getter functions: 


e First, we have rtc_date_get_month: 


uint32 t rtc date get month (void) 


{ 


return (uint32_t) ((READ_BIT(RTC->DR, (RTC_DR_MT | RTC_DR_ 
MU))) >> RTC_DR_MU_ Pos); 


} 


e Then, we have rtc_ time get second: 


uint32 t rtc time get second (void) 


{ 


return (uint32_t) (READ _BIT(RTC->TR, (RTC_TR_ST | RTC_TR_SU)) 
>> RTC_TR_SU_Pos) ; 


} 


e Next, theres rtc_ time get minute: 


uint32 t rtc time get minute (void) 


{ 


return (uint32_t) (READ _BIT(RTC->TR, (RTC_TR_MNT | RTC_TR_ 
MNU)) >> RTC_TR_MNU_ Pos); 


} 


e Finally, theres rtc_time_get_hour: 


uint32 t rtc time get hour (void) 


{ 


return (uint32_t) ((READ_BIT(RTC->TR, (RTC_TR_HT | RTC_TR_ 
HU))) >> RTC_TR_HU Pos); 


} 


With all of these functions implemented, our rtc . c file is complete. Our next task is to populate 
the rtc.h file. 


The header file 


Here’s the code: 


#ifndef RTC H _ 
#define RTC H _ 
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#include <stdint.h> 
#include "stm32f4xx.h" 


void rtc_init (void); 


vinee e CEG CONVerg OCLC imee ie velus) y 
uint32_t rtc date get _day (void); 

uint32 t rtc date get year (void); 

uint32 t rtc date get month (void); 

uint32 t rtc time get second (void); 

uint32 t rtc time get minute (void); 

uint32 t rtc time get hour (void); 

#endif 


Here, were simply exposing the functions implemented in rtc .c, making them callable from other 
files. Lets go ahead and test our driver’s main . c file. 


The main file 


Leťs update the main. file so that it looks like this: 


#include <stdio.h> 
smaLinelbyela Wie). law 


Hine iludermuare ia! 


#define BUFF_LEN 20 


uint8 t time buff [BUFF_LEN] 
uint8 t date buff [BUFF_LEN] 


oil 
~ m~ 
(si {Ss} 
Ww we 


static void display _rtc_ calendar (void) ; 


int main(void) 


{ 


/*Initialize debug UART*/ 
uaran tOr 


/*Initialize rtc*/ 


Iie! _aumalic ()) 5 


while(1) 
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display _rtc_calendar () ; 


static void display_rtc_calendar (void) 
{ 
/*Display format : hh : mm: ss*/ 
gorune ( (lasue e) Cume louie, Wa acl eee Acl sG.Aell” wee CONA 
bed2dee(rte (time get hous())); 
EECRCOnVentsbcd2dec Gaeta mengeraminuze())) 
rtc_convert_bcd2dec(rtc_time_get_second())) ; 


prineri ("Time = 2.20 3s.2d 2s j20d\n\r" te convert bed2dec (rteltimel 
gat mowie (O) 
rtc_convert_bcd2dec(rtc_time_get_minute()), 
BUCMCOnVierrt-mbCdAdec) (atu tamemgermsecond ())) is; 


/*Display format : mm: dd: yy*/ 
oprinne ichari date lotheie, Vs Acl = Goel = Gol! ace! Cletanicicic 
bed2deciieueEdavemcetamonteh (he, 
rtc_convert_bcd2dec(rtc_date_get_day()), 
Iie! _Clelahiteucie,_lovetolAGlere (Geiss! leleS jets Sveteie()) )) )) F 


oeme (DENCE! g SoA = GeAcl = aZ irec iconvert ibcdadeclntem 
date_get_month()), 

ile! (cKoraniencie, Ioyers Cle (ieee! Cline: Gjoie uday ODF 

cce CCnvert SECLE Chite Ger veer ()) )) y 


} 


Let’s break down the unique aspects of the code, starting with the display _rtc_calendar 
function. This function retrieves the current time and date from the RTC peripheral, formats these 
values, prints them out, and stores them in a buffer for further processing. 


The following are our buffer definitions: 


#define BUFF _LEN 20 


uint8 t time buff [BUFF _LEN] 
uint8 t date buff[BUFF_LEN] = {0}; 


ll 
~ 
=} 
— 
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Here, we can see the following: 


BUFF _LEN: This defines the length of the buffer for storing the time and date string 


¢ Buffers: 
* time_buff: This is an array that holds the formatted time string 


* date_buff: This is an array that holds the formatted date string 


Let’s take a closer look at the time formatting and display block: 


Sjoneabistese (((Claeue Jeme Johnie, Weal ewe! ea. Aci, 
aee _Cloavieucte_loyetolAclave:(reicie)_iealfina. Ger owie (()) )) , 
ice Clommneuete_loyeclaclere (Gece icaline Ces iliac. NDE 
IGE. CKopanSrare,_loxeOACIEVS (TEIEO)_iesliilS\_Cjeie_ Secreravel |) )) )) 7 


printr Aimer ik 52d) sis. 2dei12 2d \m\ El tcuconvertsbecd2dec (sieutamergetm 


lexoybine(()) )) 
reeRCoOnventibed2decil rect imeige timinu OD 


rtc_convert_bcd2dec(rtc_time_get_second()))j; 
Here, we can see the following: 
sprintf: This is used to format the time string into time_buff 


rtc_convert_bcd2dec: This converts the BCD values from RTC into decimal 


rtc_time_get_hour,rtc_time_get_minute, and rtc_time_get_second: 
These retrieve the current hour, minute, and second from the RTC peripheral, respectively 


printf: This function is used to print formatted output to the serial port 
% . 2d: This format specifier means that the integer will be printed with at least 2 digits, padding 
with leading zeros if necessary 
Were now ready to test our RTC calendar driver on the microcontroller. We can test the project by 
following these steps: 

1. Compile and upload the project to your microcontroller. 

2. Launch RealTerm or your preferred serial terminal program. 


3. Configure the appropriate port and baud rate. 
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4. You should see the time and date values printed and updating in real time, as shown in Figure 15.7: 


fs RealTerm: Serial Capture Program 2.0.0.70 = Oo x 


Display Pott | Capture | Pins | Send | Echo Pot} 120 | 120-2 | 12CMise | Misc | \n| Clear Freeze! ?| 
Status 
Baud [115200 _> | Port [e x| [Open Spy! A Change Vv Connected 
Parity Data Bits) ;-Stop Bits Ono oe conrei E ian a 
i Si pi : = 7 I Recei : 117 
@ None || @ abas|| © 1 bit C 2bits eceive Xon Char CTS (8) 
; pag C 7 bits | -Hardware Flow Control [D Transmit Xoff Char: |19 DCD (1) 
ven ; = = 
C Mark Bbits | © None C RTS/CTS DSR (6) 
C Space | C Sbits|| C DTR/DSR C RS485-ts Ring (3) 
BREAK 
Error 


Char Count:265236 |CPS:20400 Port: 8 115200 8N1 None 


Figure 15.8: Expected results 


Summary 


In this chapter, we explored the RTC peripheral, a component for timekeeping in embedded systems. 
This peripheral is essential for applications requiring precise time and date maintenance, making it 
fundamental for a wide range of embedded applications. 


We began by introducing RTCs and understanding their functionality. This included a deep dive into 
how RTCs operate, which involved focusing on the crystal oscillator, counters, time and date registers, 
and the importance of battery backup. We illustrated these concepts with case studies, showcasing the 
practical use of RTCs in data logging, alarm clocks, time-stamping transactions, and calendar functions. 


Following this, we examined the STM32 RTC module, highlighting its key features and capabilities. 
We discussed the calendar in terms of sub-seconds accuracy, dual programmable alarms, low power 
consumption, backup domain, daylight saving time adjustments, automatic wakeup, tamper detection, 
digital calibration, and synchronization with external clocks. Each feature was detailed to show its 
application and importance in maintaining accurate timekeeping. 


Summary 


Next, we analyzed the relevant registers from the STM32 reference manual, providing a detailed 
understanding of the configuration and operation of the RTC. We covered the RTC_TR, RTC_DR, 
RTC_CR, RTC_ISR, RTC_PRER, RTC_ALRMAR, RTC_ALRMBR, and RTC_WUTR registers. Each 
register’s role and key fields were explained to ensure you have a comprehensive grasp of how they 
contribute to the RTC’s functionality. 


Finally, we applied this knowledge to develop an RTC driver. We walked through the steps to create 
and configure the RTC driver, starting with the initialization sequence and covering functions to set 
the date and time. We also implemented helper functions for converting values between decimal and 
BCD formats, as well as retrieving current time and date values from the RTC peripheral. 


In the next chapter, we will delve into another useful peripheral, expanding our knowledge and toolkit 
for embedded systems development. 
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In this chapter, we'll learn about the Independent Watchdog (IWDG) timer, a unique component 
for enhancing the reliability of embedded systems. IWDG is essential for monitoring the system's 
operation and ensuring it can recover from unexpected faults or malfunctions. 


We will begin by exploring the general concept of watchdog timers (WDTs) and their importance 
in embedded systems. Following this, we will examine how WDTs function and the unique features 
of IWDG. Next, we will focus specifically on the STM32 implementation of the IWDG, looking at its 
key registers and configuration. Finally, we will apply this knowledge to develop a bare-metal IWDG 
driver, providing practical code examples to solidify our understanding. 


In this chapter, we will cover the following main topics: 
e Understanding WDTs 
e Types of WDTs 
e The STM32 IWDG 
e Developing the IWDG driver 


By the end of this chapter, you will have a comprehensive understanding of IWDG timers and their 
critical role in embedded systems. You will also gain skills to develop and implement IWDG drivers 
for STM32 microcontrollers, ensuring your systems can maintain robustness and reliability. 


Technical requirements 


All code examples for this chapter can be found on GitHub at the following link: 


https://github.com/PacktPublishing/Bare-Metal -Embedded-C- Programming 
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Understanding WDTs 


WDTs are one of the unsung heroes of embedded systems. They quietly monitor the system's health, 
ensuring it can recover gracefully from unexpected hitches. Imagine them as vigilant guards, always 
on the lookout for system malfunctions, ready to reset the microcontroller if something goes wrong. 
In this section, we'll explore what WDTs are and how they function, and dive into some common use 
cases to illustrate their importance. 


What are WDTs? 


WDTs are like guardians for your microcontroller. Imagine youre using a device, and something goes 
wrong—a bug in the developer’s code causes an infinite loop, or a hardware glitch freezes the system. 
Without a watchdog, your device would be stuck, potentially causing significant problems, especially 
in critical applications such as medical devices or automotive systems. 


A WDT is a hardware or software timer that resets the system if the main program fails to reset the 
timer before it expires. It’s a simple yet powerful mechanism to ensure that your system can recover 
from unexpected issues. Let’s see how they work. 


How WDTs work 


Think of a WDT as an hourglass that you need to turn over regularly to prevent it from running out 
of sand. Here’s a step-by-step breakdown of how it works: 


1. Initialization: When your system starts, you initialize the WDT with a specific timeout period. 
This period is the maximum time your system can run without resetting the timer. 


2. Countdown: The WDT starts counting down from the set timeout value. If it reaches zero, it 
assumes something went wrong and triggers a system reset. 


3. Resetting the timer: Your main program needs to periodically reset the WDT before it reaches 
zero. This action is often called feeding the watchdog or kicking the dog. If the program is 
running correctly, it will continue to reset the timer, preventing a reset. 


4. System reset: If your program fails to reset the WDT in time—perhaps because it got stuck 
in an infinite loop or encountered a critical error—the WDT will expire and reset the system, 
bringing it back to a known state. 


Now that we understand the basics, let’s look at some real-world applications where WDTs play a 
crucial role. 


Understanding WDTs 


Common use cases 
Following is a list of some real-world applications. 
Industrial automation 


In industrial automation, reliability is paramount. Machines and processes need to run continuously 
and without failure. WDTs ensure that if a Programmable Logic Controller (PLC) or other control 
systems hang or crash, they can quickly recover. 


Example: Imagine a conveyor belt system in a manufacturing plant. The PLC controlling the conveyor 
belt has a WDT set to 1 second. If the PLC software fails to reset the watchdog within 1 second due 
to a software bug or external interference, the WDT will reset the PLC. This reset ensures that the 
conveyor belt can resume operation with minimal downtime, preventing potential production losses. 


Automotive systems 


Modern vehicles rely heavily on embedded systems for various functions, from engine control to 
infotainment. WDTs are vital in ensuring these systems operate reliably. 


Example: Consider an engine control unit (ECU) in a car. The ECU monitors and controls critical 
engine parameters. A WDT in the ECU might be set to 500 milliseconds. If the ECU software fails to 
reset the watchdog due to a fault, the WDT resets the ECU. This reset can prevent engine misbehavior, 
ensuring the vehicle operates safely. 


Medical devices 


In medical devices, WDTs can be life-saving. Devices such as pacemakers, infusion pumps, and patient 
monitors must operate without failure. 


Example: Take a patient monitor that tracks vital signs such as heart rate and blood pressure. The 
monitor’s software includes a WDT set to 2 seconds. If the software encounters a problem and fails 
to reset the watchdog, the device will reset. This reset ensures the monitor can quickly recover and 
continue providing accurate, real-time data, which is crucial for patient care. 


Consumer electronics 


Even in consumer electronics, WDTs help maintain system reliability and enhance user experience. 
Think of smartphones, smart home devices, and gaming consoles. 


Example: In a smart thermostat, the software manages temperature settings and connectivity. A WDT 
ensures that if the software freezes, the system resets and continues operating. This functionality prevents 
users from experiencing extended downtime, maintaining comfort and convenience in their homes. 
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These examples illustrate the crucial role that WDTs play in modern systems. When implementing 
WDTs, it’s essential to consider several key factors to ensure their effectiveness. Let’s see some of 
these factors. 


Practical considerations 
When implementing WDTs, you must consider the following: 


¢ Timeout period: Choose an appropriate timeout period based on your application’s needs. Too 
short, and you risk unnecessary resets; too long, and you might not recover quickly enough 
from faults. 


e Reset mechanism: Ensure that resetting the WDT (feeding the dog) is done in a part of your 
code that runs regularly and indicates the system is operating correctly. 


e Recovery strategy: Plan how your system should recover after a watchdog reset. Ensure critical 
data is preserved and the system returns to a safe state. 


e Testing: Thoroughly test your WDT implementation to ensure it behaves as expected under 
various fault conditions. 


Next, let’s see the types of WDTs available. 


Types of WDTs 


WDTSs can be categorized into several types based on their functionality and integration. Lets explore 
the most common types. 


Internal WDTs 


Internal WDTs are built into the microcontroller. They are a convenient option because they don't 
require additional external components. These timers are directly integrated into the microcontroller’s 
architecture and can be configured through software. 


They have the following features: 
e Integration: No need for external circuitry 


e Configuration: Typically configured using the microcontroller’s registers 


e Power: They can continue to operate in low-power modes, making them suitable for 
battery-powered applications 


Example use case: In a small IoT device, an internal WDT can monitor the microcontroller’s operation 
without adding extra hardware, ensuring the device can reset itself if it encounters an error. 


Understanding WDTs 


External WDTs 


External WDTs are separate components connected to the microcontroller. These timers provide 
additional flexibility and can be used when the internal WDT isn't sufficient or if redundancy is required. 


Here is a list of their features: 
e Flexibility: Can be chosen based on specific requirements (for example, longer timeout periods) 
e Redundancy: Adding an external watchdog provides an extra layer of safety 


e Independence: Operate independently of the microcontroller’s clock and power 


Example use case: In a critical automotive system, an external WDT can provide an additional safeguard, 
ensuring the system resets even if the internal timer fails. 


Windowed WDTs 


Windowed WDTs (WWDTs) add an extra layer of control by introducing a window period. The 
system must reset the timer within a specific window period; too early or too late resets the system. 
This prevents scenarios where the software gets stuck in a loop resetting the watchdog too frequently 
(which could mask a malfunction). 


Their features include the following: 
e Precision: Require the timer to be reset within a specific time window 
e Fault detection: Can detect both early and late watchdog resets, offering improved fault detection 


e Security: Enhance system security by ensuring the timer is reset at appropriate intervals 


Example use case: In a medical device, a WWDT ensures the control software operates correctly within 
defined time intervals, adding an extra layer of reliability. 


IWDGs 


IWDGs are designed to be robust and reliable. They run from a separate clock source, usually a low-speed 
internal (LSI) clock, and operate independently of the main system clock. This independence ensures 
they continue to function even if the main clock fails. 


Their features include the following: 
e Independence: Operate from a separate clock source 


e Robustness: Continue functioning even if the main system clock fails 


e Minimal configuration: Typically simple to configure and use 


Example use case: In an industrial control system (ICS), an IWDG ensures the system can recover 
from malfunctions, even if the main clock source is disrupted. 
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Selecting the appropriate WDT depends on several factors, including the criticality of the application, 
power constraints, and required reliability. 


Choosing the right WDT 
Here's a quick guide to help you choose the right WDT: 


e For low-power applications: Consider internal WDTs due to their integration and low 
power consumption 


e For high-reliability systems: Use external WDTs for redundancy 
e For applications requiring precise timing: WWDTs provide enhanced fault detection 


e For systems needing robust operation: IWDGs offer continued functionality even if the main 
clock fails 


Understanding the different types of WDTs and their features allows us to choose the right one for our 
applications. Whether it’s an internal WDT for simplicity, an external one for redundancy, a WWDT 
for precise control, or an IWDG for robustness, there’s a WDT suited for every need. 


In the upcoming section, we will delve into the IWDG embedded within the STM32F411 microcontroller, 
examining its features and how to leverage it for enhanced system reliability. 


The STM32 IWDG 


In this section, we'll analyze the STM32 IWDG module, exploring its main features and other relevant 
information to help you understand how to leverage this powerful feature in your embedded applications. 


STM32 microcontrollers feature two types of WDTs: the IWDG and the Window Watchdog (WWDG). 
Both are essential for detecting and correcting software malfunctions by initiating a system reset, but 
they each have unique characteristics and applications. 


The IWDG operates using a dedicated LSI clock, ensuring it continues to function even if the main 
system clock fails. This makes it highly reliable for applications that require continuous monitoring, 
regardless of the main clock’s state. In contrast, the WWDG derives its clock from the APB1 clock and 
features a configurable time window. The system must refresh the WWDG within this time window; 
failing to do so, either too early or too late, will trigger a system reset. 


The IWDG is best suited for applications needing an independent watchdog process with lower timing 
accuracy constraints, while the WWDG is ideal for applications requiring precise timing windows. 
Let’s see some key features of the IWDG. 


The STM32 IWDG 339 


Key features of the IWDG 
The IWDG in STM32 microcontrollers boasts several key features: 


e Free-running downcounter: The IWDG operates as a free-running downcounter, continuously 
counting down from a preset value 


¢ Independent clock source: It uses an independent resistor-capacitor (RC) oscillator, allowing 
it to function in low-power modes such as Standby and Stop 


e System reset on timeout: If the WDT is activated and the downcounter reaches zero (0x000), 
a system reset is triggered 


e Write access protection: To modify critical registers, a specific sequence of operations is 
required, ensuring protection against accidental or malicious modifications 


Let’s see how it works. 


How the IWDG works 


The IWDG module operates as an independent safeguard for the microcontroller, ensuring the system 
can recover from software malfunctions. Figure 16.1 presents a block diagram of the IWDG, sourced 
from the reference manual: 


Prescaler register | |Status register Reload register 


Key register 
IWDG_PR IWDG_SR IWDG_RLR IWDG_KR 
———d) Z- 


12-bit reload value 


LSI 
e 


(40 kHz)} prescaler 


12-bit downcounte IWDG reset 


VDD voltage domain 


Figure 16.1: IWDG block diagram 
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Let’s break down its functional blocks and how it operates: 


e Starting the IWDG: To start the IWDG, we write the value 0xCCCC to the Key Register 
(IWDG_KR). This action initiates the downcounter, which begins counting down from the 
reset value of OXFFF. 


e Preventing a reset: To prevent the WDT from reaching zero and triggering a reset, the counter 
must be periodically reloaded. This is done by writing the value OXAAAA to the IWDG_KR 
register, which reloads the counter with the value from the Reload Register (IWDG_RLR) and 
prevents a reset. 


e Hardware watchdog feature: If enabled, the IWDG is automatically activated at power-on. In 
this mode, the WDT will generate a reset unless the Key Register is written with the appropriate 
value before the counter reaches zero. 


e Register access protection: To modify the prescaler (IWDG_PR) and reload (IWDG_RLR) 
registers, we must temporarily disable write access protection by writing the code 0x5555 to the 
IWDG_KR register. After doing this, any changes to these registers must be made immediately; 
otherwise, access protection will be re-enabled. 


With this general overview of the IWDG block in mind, let’s analyze the key registers one by one. 


IWDG registers 


In this section, we will explore the characteristics and functions of some of the crucial registers within 
the IWDG peripheral. 


Let’s start with the IWDG Key Register (IWDG_KR). 
IWDG Key Register (IWDG_KR) 


The IWDG_KR register is a key register used to control the IWDG’s operations, including starting the 
watchdog, reloading the counter, and disabling write access to other registers. This register is pivotal 
in managing the IWDG’s functionality. 


Key operations in this register include the following: 


e OxCCCC: Start the IWDG. Writing this value to IWDG_KR starts the IWDG timer. 


e OxAAAA: Reload the counter. Writing this value reloads the IWDG counter, preventing it from 
reaching zero and triggering a system reset. 


e 0x5555: Disable write protection. This value allows modifications to the prescaler and 
reload registers. 


Next, let’s discuss the IWDG Prescaler Register (IWDG_PR). 
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IWDG Prescaler Register (IWDG_PR) 


The IWDG_PR register is used to set the prescaler value, which determines the frequency of the IWDG 
clock by dividing the LSI clock. Adjusting this register helps configure the countdown speed of the WDT. 


Key bits in this register is PR [2 : 0]: Prescaler value. These bits can be set to divide the LSI clock by 
4, 8, 16, 32, 64, 128, or 256, allowing flexibility in setting the WDT interval. 


Next, we move on to the IWDG Reload Register (IWDG_RLR). 
IWDG Reload Register (IWDG_RLR) 


The IWDG_RLR register defines the reload value for the IWDG counter. This value determines the 
timeout period before the watchdog triggers a system reset if not reloaded in time. 


The key field in this register is RL [11:0]; this means reload value. It is a 12-bit value that sets the 
counter’s reload value, which can range from 0x000 to OxFFE 


Finally, we examine the IWDG Status Register (IWDG_ SR). 
IWDG Status Register (IWDG_SR) 


The IWDG_SR register provides status information about the IWDG, indicating whether updates to 
the prescaler or reload registers are ongoing. 


Key bits in this register include the following: 


e PVU: Prescaler value update (PVU). This bit indicates that the prescaler value is being updated. 


e RVU: Reload value update (RVU). This bit indicates that the reload value is being updated. 


With a clear understanding of the IWDG’s functionality and its registers, we can now move on to the 
next section, where we will develop the IWDG driver. 


Developing the IWDG driver 


In this section, we'll use what we've learned so far in this chapter to develop the IWDG driver. 
Let’s start by setting up the project. 


Create a copy of your previous project in your IDE, following the steps outlined in earlier chapters. 
Rename this copied project to ITWDG. Next, create a new file named iwdg.c in the Src folder and 
another file named iwdg.h in the Inc folder. 
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The IWDG implementation file 


Populate your iwdg . c file with the following code: 


#include "iwdg.h" 


#define IWDG KEY ENABLE 0x0000CCCCU 
#define IWDG KEY WR ACCESS ENABLE 0x00005555U 
#define IWDG PRESCALER 4 0x00000000U 
#define IWDG RELOAD VAL OxFFF 


static uint8 t isIwdg ready (void); 


void iwdg init (void) 
/*Enable the IWDG by writing 0x0000CCCC in the IWDG KR register*/ 
IWDG->KR = IWDG_KEY ENABLE; 


/*Enable register access by writing 0x0000 5555 in the IWDG KR 
register*/ 
IWDG->KR = IWDG KEY WR ACCESS ENABLE; 


/*Set the IWDG Prescaler* / 
IWDG->PR = IWDG PRESCALER 4; 


/*Set the reload register (IWDG RLR) to the largest value OxFFF*/ 
IWDG->RLR = IWDG RELOAD VAL; 


/*Wait for the registers to be updated (IWDG_SR = 0x0000 0000) */ 
while (isIwdg ready() != 1){} 


/*Refresh the counter value with IWDG KR (IWDG KR = 0x0000 AAAA) */ 
IWDG->KR = IWDG_ KEY RELOAD; 


static uint8 t isIwdg ready (void) 
return ((READ BIT(IWDG->SR, IWDG_SR_PVU | IWDG SR_RVU) == 0U) ? 1UL 
OUL) ; 


} 
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Let’s break it down. Let’s look at the macro definitions first: 


#include "iwdg.h" 


#define IWDG KEY ENABLE 0x0000CCCCU 
#define IWDG KEY WR ACCESS ENABLE 0x00005555U 
#define IWDG PRESCALER 4 0x00000000U 
#define IWDG RELOAD VAL OxFFF 


e IWDG KEY ENABLE: This macro defines the key to enable the IWDG 


e IWDG KEY WR ACCESS ENABLE: This macro defines the key to enable write access to 
IWDG registers 


e IWDG PRESCALER_4: This macro defines the prescaler value for the IWDG 


e IWDG RELOAD VAL: This macro sets the reload value to the maximum, OxFFF, providing 
the longest timeout period 


Next, we have the iwdg_ init () function. 


IWDG->KR = IWDG KEY ENABLE; 


This line writes 0x0000CCCC to the IWDG Key Register (KR) to enable the IWDG. 


IWDG->KR = IWDG KEY WR ACCESS ENABLE; 


This line writes 0x00005555 to the IWDG Key Register to enable write access to the prescaler and 
reload registers. 


IWDG->PR = IWDG_ PRESCALER 4; 

This line sets the prescaler register to divide the clock by 4, as defined by IWDG_PRESCALER_4. 
IWDG->RLR = IWDG RELOAD VAL; 

This sets the reload register to the maximum value of OxFFF to get the longest possible timeout period. 
while (isIwdg ready() != 1) {} 


This waits until the IWDG status register (SR) indicates that the prescaler and reload registers have 
been updated and are ready. 


IWDG->KR = IWDG KEY RELOAD; 


This writes 0x0000AAAA to the IWDG Key Register to reload the counter and prevent the IWDG 
from resetting the system. 
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The isIwdg_ready () function checks the Status Register to see if the PVU and RVU bits are cleared. 
It returns 1 if both bits are cleared, indicating that the IWDG is ready, or 0 otherwise. 


We are now ready to test inside main.c. 


The main file 
Update your main . c file as shown next: 


#include <stdio.h> 
#include "adc.h" 
#include "uart.h" 
#include "gpio.h" 
#include "iwdg.h" 
#include "gpio exti.h" 


tiach ic Cj otn oresar 
static void check reset_source (void) ; 


int main(void) 


{ 


/*Initialize debug UART*/ 
tere ouie y 


/*Initialize LED*/ 
algeyel_strmalie () 9 


/*Initialize EXTI*/ 
DEIS eiet dmi (()) p 


/*Find reset source*/ 
checkmnresctmsouncel (Er 


/*Initialize IWDG*/ 
iwdg_init(); 
while (1) 


{ 


ie ej lda ees l= sas) 
{ 
/*Refresh IWDG down-counter to default value*/ 
IWDG->KR = IWDG_KEY_RELOAD; 
led_toggle(); 
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for(int i = 0; i < 90000; i++) {} 


} 


The main function kicks off by setting up the universal asynchronous receiver-transmitter (UART) 
for serial communication, ensuring that we can send debugging information to the serial port. Next, 
it initializes the LED and configures the external interrupt (EXTI) on pin PC13, which is connected 
to the blue push button. The function then calls check_reset_source to determine if the last 
reset was caused by the IWDG. After that, it initializes the IWDG itself, ensuring that the system can 
recover from software malfunctions. The main loop continually checks the state of g btn_press; 
if the button hasn't been pressed, it refreshes the IWDG to prevent a system reset and toggles the LED, 
providing a visual indicator of system activity. 


Let’s look at the next part of the code: 


static void check reset_source (void) 


{ 
if ((RCC->CSR & RCC_CSR_IWDGRSTF) == (RCC_CSR_IWDGRSTF) ) 
{ 
/*Clear IWDG Reset flag*/ 
RCC->CSR = RCC_CSR_RMVF; 
/*Turn LED On*/ 
edion 
printf ("RESET was caused by IWDG..... Wate? )) 9 
while e btn press! != 1) 
{ 
} 
Gj loin joes = Op 
} 
} 


The check_reset_ source function determines if the most recent system reset was triggered by 
the IWDG. It begins by checking the Reset and Clock Control (RCC) status register (CSR) for the 
IWDG reset flag (RCC_CSR_IWDGRSTF). If this flag is set, it confirms that the watchdog initiated 
the reset. The function then clears this flag by writing to the RCC_CSR_RMVF bit, ensuring the flag 
is reset for future detection. As a visual indication of the IWDG-triggered reset, the LED is turned 
on, and a message is printed to the UART. The function enters a loop, waiting for the user to press the 
button (detected by the g_btn_press variable) before clearing the button press state and exiting. 
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Then, there is the callback: 


static void exti_callback (void) 


{ 


g_btn_ press’ = 1; 


} 


This is followed by the interrupt handler: 


void EXTI15 10 IRQHandler(void) { 


if ((EXTI->PR & LINE13) !=0) 

{ 
/*Clear PR flag*/ 
EXTI->PR |=LINE13; 
//Do something... 
exti_callback(); 

} 


} 


The ext i_callback function is a simple yet vital part of our interrupt-handling mechanism. Its 
sole purpose is to set the g_btn_press flag to 1, indicating that a button press has been detected. 
This flag is later used in the main loop to control the program flow. The EXTI15_10_IRQHandler 
function is the interrupt handler for external interrupts on lines 10 to 15. When an interrupt is triggered 
on line 13, this handler first checks the pending register (PR) to confirm the interrupt source. Once 
verified, it clears the pending flag by writing back to the PR register. After clearing the interrupt, the 
handler calls ext i_callback to update the g_btn_press flag. 


Testing the project 
To test the project on the microcontroller, follow these steps: 


1. Build and run the project: Compile the code and upload it to your microcontroller. Once running, 
you should observe the green LED blinking, indicating that the system is functioning correctly. 


2. Monitor the serial output: Open RealTerm or any other serial terminal application. Configure 
it with the appropriate port and baud rate to view the debug messages. This will allow you to 
confirm when the system restarts due to the IWDG. 


3. Trigger a watchdog reset: Press the blue push button to stop the IWDG timer from being 
refreshed. After the IWDG timeout period elapses, the system will reset, and you should see a 
corresponding message in the serial terminal. 


Summary 


Summary 


In this chapter, we explored the IWDG timer, an important component for enhancing the reliability 
of embedded systems. We began by discussing the general concept of WDTs, emphasizing their role 
in ensuring systems can recover from unexpected faults or malfunctions. We explored how WDTs 
function and highlighted the unique features of the IWDG. 


Next, we focused on the STM32 implementation of the IWDG, examining its key registers and 
configuration options. We detailed the purpose and usage of essential registers such as the Key 
Register (IWDG_ KR), Prescaler Register (IWDG_PR), Reload Register (IWDG_RLR), and Status 
Register (IWDG_ SR). This provided a comprehensive understanding of how to configure and control 
the IWDG for robust system operation. 


We also provided practical examples to solidify our understanding, including the development of a bare- 
metal IWDG driver. This involved initializing the IWDG, configuring its prescaler and reload values, 
and implementing the necessary functions to ensure the system can recover from software malfunctions. 


In the next chapter, we will learn about the Direct Memory Access (DMA) module, an advanced 
feature for transferring data. 
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In this chapter, we will explore Direct Memory Access (DMA), a powerful feature in microcontrollers 
that allows peripherals to transfer data to and from memory without involving the CPU. This 
functionality is critical for improving data throughput and freeing up the CPU to handle other tasks, 
making it fundamental to high-performance embedded system development. 


We will begin by understanding the basic principles of DMA and its significance in embedded systems. 
We will then delve into the specifics of the DMA controller in STM32F4 microcontrollers, examining its 
structure and features and how it manages data transfers. Following this, we will apply this theoretical 
knowledge to develop practical DMA drivers for various use cases, including memory-to-memory 
transfers, Analog-to-Digital Converter (ADC) data transfers, and Universal Asynchronous Receiver- 
Transmitter (UART) communications. 


In this chapter, we will cover the following main topics: 


e An overview of DMA 

e The STM32F4 DMA 

e Developing the DMA ADC driver 
e Developing the DMA UART driver 


e Developing the DMA memory-to-memory driver 


By the end of this chapter, you will have a comprehensive understanding of how DMA works and 
how to implement it in your projects. You will be able to develop efficient DMA drivers to handle 
data transfers in various scenarios, significantly enhancing the performance and responsiveness of 
your embedded systems. 
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Technical requirements 


All the code examples for this chapter can be found on GitHub at https: //github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


Understanding Direct Memory Access (DMA) 


DMA is a feature that can significantly elevate the performance of your embedded systems. If you’ve 
been dealing with data transfers in your microcontroller projects, you know how taxing it can be on 
the CPU to handle all that data movement. This is where DMA steps in as a game-changer, offloading 
the data transfer tasks from the CPU and allowing it to focus on more critical functions. Let’s see 
how it works. 


How DMA works 


So, what exactly is DMA, and how does it work? In simple terms, DMA is a method that allows 
peripherals within a microcontroller to transfer data directly to and from memory, without requiring 
continuous CPU intervention. Imagine it as a dedicated assistant that takes over the tedious task 
of moving boxes (data) so that you (the CPU) can focus on more important work, such as solving 
complex problems or managing other peripherals. 


A typical DMA controller in a microcontroller has multiple channels, each capable of handling a 
specific data transfer operation. Each channel can be configured independently to manage transfers 
between various peripherals and memory. 


Here’s a step-by-step look at how DMA generally operates: 


1. Initialization: The DMA controller and channels are configured. This setup includes specifying 
the source and destination addresses, the direction of data transfer, and the number of data 
units to transfer. 


2. Trigger: The data transfer is initiated by a trigger, which can be an event such as a peripheral 
signaling that it’s ready to send or receive data, or a software command. 


3. Data transfer: Once triggered, the DMA controller takes over, reading data from the source 
address and writing it to the destination address. This process continues until the specified 
number of data units is transferred. 


4. Completion: Upon completing the transfer, the DMA controller can generate an interrupt 
to notify the CPU that the transfer is done, allowing the system to perform any necessary 
post-transfer processing. 


Next, let’s take a look at some key features of DMA controllers. 
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Key features 


DMA controllers are packed with features that make them versatile and powerful. Let’s break down 
some of the key specifications you'll often encounter: 


Channels and streams: DMA controllers typically have multiple channels and streams, each 
capable of handling a different transfer. For instance, the STM32F4 microcontroller has up to 
16 streams in its DMA controllers. 


Priorities: Channels can be assigned different priority levels, ensuring that more critical transfers 
get precedence over less critical ones. 


Transfer Types: DMA can handle various types of transfers, including memory-to-memory, 
peripheral-to-memory, and memory-to-peripheral. 


FIFO: Some DMA controllers come with a First-In-First-Out (FIFO) buffer, which helps 
manage data flow and improve efficiency, especially in burst transfers. 


Circular mode: This mode allows the DMA to continuously transfer data in a loop, which is 
particularly useful for peripherals that need constant data streaming, such as audio or video feeds. 


Interrupts: DMA controllers can generate interrupts on transfer completion, half-transfer 
completion, and transfer errors, allowing the CPU to react appropriately to different stages 
of the transfer. 


To understand the real power of DMA, let’s look at some common use cases where DMA shines. 


Common use cases 


Here are some common use cases of DMA: 


Audio streaming: DMA is heavily used in audio applications where continuous data streaming 
is essential. For instance, in a digital audio player, the audio samples must be continuously sent 
to a Digital-to-Analog Converter (DAC). Using DMA, the audio data can be streamed from 
memory to the DAC without CPU intervention, ensuring smooth playback and freeing up the 
CPU to manage the user interface and other tasks. 


Sensor data acquisition: In applications such as environmental monitoring or industrial 
automation, sensors often need to sample data at precise intervals. For example, an ADC can 
be configured to continuously sample temperature data, with DMA transferring the sampled 
data directly to memory. This setup ensures that the CPU isn't bogged down with handling 
each individual sample, thus maintaining efficient and timely data collection. 
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e Communication interfaces: DMA is a lifesaver when dealing with high-speed communication 
protocols such as UART, SPI, or I2C. Consider a scenario where an embedded system needs 
to log data received over UART to an SD card. Without DMA, the CPU would need to handle 
each byte of data, process it, and then write it to the SD card, which can be highly inefficient. 
With DMA, the data received over UART can be directly written to memory, and another DMA 
channel can transfer it to the SD card, all with minimal CPU intervention. 


e Graphics processing: DMA is also crucial in applications involving graphics, such as updating 
a display buffer. In a system where the display needs to be refreshed continuously, the DMA 
can handle the transfer of image data from memory to the display controller. This ensures 
smooth and flicker-free graphics rendering, while the CPU can focus on generating the next 
frame or managing user inputs. 


With this in mind, let’s compare some DMA solutions to non-DMA solutions. 
Case study 1 - audio streaming 


Scenario: You are developing an audio playback system that streams digital audio data from a 
microcontroller to a DAC. 


Without DMA: The CPU is responsible for fetching each audio sample from memory and sending it to 
the DAC. Given the high sampling rate required for audio applications (e.g., 44.1 kHz for CD-quality 
audio), the CPU must handle tens of thousands of interrupts per second just to maintain the audio 
stream. This constant load significantly limits the CPU’s ability to perform other tasks, potentially 
leading to audio glitches and reduced system responsiveness. 


With DMA: The DMA controller is configured to transfer audio data directly from memory to the 
DAC. The CPU sets up the DMA transfer and then handles higher-level tasks, only occasionally 
checking the status of the transfer. This setup ensures smooth and uninterrupted audio playback while 
freeing up the CPU to manage other aspects of the system, such as user interface and control logic. 


Case study 2 - high-speed data acquisition 


Scenario: You are developing a data acquisition system that continuously samples data from multiple 
sensors via ADCs and stores the data for later analysis. 


Without DMA: The CPU must handle each ADC conversion, read the data, and store it in memory. If 
the sampling rate is high, the CPU can become overwhelmed, leading to missed samples and unreliable 
data collection. This approach can also complicate real-time data processing and analysis, as the CPU 
is bogged down with managing the data flow. 


The DMA modules of the STM32F4 microcontroller 


With DMA: The ADC is configured to generate DMA requests. Each time a conversion is complete, 
the DMA controller transfers the data from the ADC to memory without CPU intervention. The 
CPU can then process the collected data in batches, ensuring that no samples are missed and enabling 
real-time data analysis and decision-making. 


Case study 3 - an LCD display refresh 


Scenario: You are developing a graphical application that continuously updates an LCD display with 
new data. 


Without DMA: The CPU must update the display by sending each pixel or line of data directly to 
the LCD controller. This process can be very CPU- intensive, especially for high-resolution displays, 
leading to sluggish performance and reduced responsiveness in the user interface. 


With DMA: The DMA controller is configured to transfer display data from memory to the LCD 
controller. The CPU sets up the DMA transfer and then focuses on generating new graphical data or 
handling user inputs. The DMA controller ensures that the display is updated smoothly and efficiently. 


In each of the case studies, we've seen how using DMA can transform a system’s capabilities, freeing 
up the CPU to handle more critical tasks and ensuring that data transfers are handled efficiently and 
reliably. Understanding and implementing DMA in your projects can lead to more robust, responsive, 
and high-performance embedded systems. 


In the following section, we will delve deeper into the specifics of the STM32F4 DMA controller, 
exploring its architecture, key registers, and practical implementation techniques. 


The DMA modules of the STM32F4 microcontroller 


Each STM32F4 microcontroller is equipped with two DMA controllers, each supporting up to 8 
streams. Each stream can manage multiple requests, providing up to 16 streams in total to handle 
various data transfer tasks. A stream is a unidirectional pathway that facilitates data transfer between 
a source and a destination. The architecture includes an arbiter to prioritize these DMA requests, 
ensuring that high-priority transfers are handled promptly. 


Let’s see some key features of the STM32F4 DMA controller. 
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The key features of the STM32F4 DMA controller 


The following are the key features of the STM32F4 DMA controller: 


Independent FIFO: Each stream includes a four-word FIFO buffer, which can operate in either 
direct mode or FIFO mode. In direct mode, data transfers occur immediately upon request, while 
FIFO mode allows for threshold-level buffering, enhancing efficiency for burst data transfers. 


Flexible configuration: Each stream can be configured to handle the following: 
* Peripheral-to-memory 
* Memory-to-peripheral 


* Memory-to-memory transfers 


Additionally, streams can be set up for regular or double-buffer transfers, the latter enabling seamless 
data handling by swapping memory buffers automatically: 


Prioritization and arbitration: DMA stream requests are prioritized via software with four 
levels - very high, high, medium, and low. If two streams have the same priority, hardware 
prioritization based on the stream number ensures orderly data transfer. 


Incremental and burst transfers: The DMA controller supports both incremental and 
non-incremental addressing for source and destination. It can manage burst transfers of 4, 8, 
or 16 beats, optimizing bandwidth usage. The term beat refers to the individual units of data 
that are transferred in a single DMA transaction. 


Interrupts and error handling: Each stream supports multiple event flags such as transfer 
complete, half-transfer, transfer error, FIFO error, and direct mode error. These flags can trigger 
interrupts, providing robust error handling and status monitoring. 


To fully utilize the STM32F DMA controller, it’s important to understand DMA transactions and 
channel selection. Let’s break these concepts down. 


DMA transactions: A typical DMA transaction involves the following: 


" Loading data from the source address (either a peripheral or memory) 

Storing data to the destination address 

" Decrementing the DMA_SxNDTR register to track the number of remaining data items 
Channel selection: Each stream can be associated with up to eight different channels, selectable 


via the CHSEL bits in the DMA_SxCR register. This flexibility allows various peripherals to 
initiate DMA requests efficiently. 


The DMA modules of the STM32F4 microcontroller 


Previously, we discussed that the STM32F4 DMA controller supports three distinct transfer modes. 
Now, let’s explore the characteristics of each mode. 


Transfer modes 
There are three transfer modes: 


e Peripheral-to-memory mode: This mode is activated by setting the EN bit in the DMA_SxCR 
register. The stream transfers data from the peripheral to memory, using the FIFO buffer if enabled. 


e Memory-to-peripheral mode: This is similar to peripheral-to-memory, but the transfer direction 
is reversed. Data is loaded from memory and sent to the peripheral. 


e Memory-to-memory mode: This mode is unique, as it does not require peripheral requests. The 
DMA stream transfers data between two memory locations, using the FIFO buffer if enabled. 
This is particularly useful for large data transfers within memory. 


The DMA controller can automatically increment source and destination pointers, facilitating efficient 
data transfers across different memory regions. This is configurable via the PINC and MINC bits in 
the DMA_SxCR register. 


The STM32F4 DMA controller also provides data mode options. 


DMA data modes 


This data mode options include the following: 


e Circular mode: Circular mode allows the DMA to handle continuous data flows by automatically 
reloading the initial values in the DMA_SxNDTR register. This is especially useful for applications 
such as ADC sampling, where data needs to be continuously recorded. 


e Double buffer mode: Double buffer mode enhances efficiency by allowing the DMA controller 
to swap between two memory buffers automatically. This ensures continuous data processing. 
While the CPU works on one buffer, the DMA can load the next set of data into the other buffer. 


In the next section, we will examine the STM32F4 DMA block diagram from the reference manual. 
This will help us better understand the key characteristics and functionalities of the DMA controller. 
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The STM32F4 DMA block diagram 


The DMA has two ports for data transfer - one peripheral port and one memory port, as shown in 
Figure 17.1. 
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Figure 17.1: The DMA module, indicating the data transfer ports 
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Each of the two DMA modules features eight distinct streams, with each stream dedicated to handling 
memory access requests from various peripherals. 
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Figure 17.2: The DMA module, indicating the streams 
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Each stream can accommodate up to eight selectable channels, which are software-configurable to 
enable multiple peripherals to initiate DMA requests. However, within any given stream, only one 
channel can be active at a time. 
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Figure 17.3: The DMA module, with channels zoomed in 


To find the mappings of DMA channels and streams to the various peripherals of the microcontroller, 
refer to page 170 of the reference manual (RM0383). 
Before we start developing our DMA drivers, the final piece of the puzzle involves familiarizing 


ourselves with the key DMA registers. 


The key STM32 DMA registers 


In this section, we will explore the characteristics and functions of some of the crucial registers within 
the DMA peripheral, starting with the DMA Stream Configuration Register. 


The DMA modules of the STM32F4 microcontroller 


The DMA Stream Configuration Register (DMA_SxCR) 


The DMA Stream Configuration Register (DMA_SxCR) is one of the primary registers used to configure 
a DMA stream’s operational settings. This register allows us to set up various parameters, such as the 
data direction, the size of the data items, and the priority level of the stream. The key bits in this 
register include the following: 

e EN: Stream enable. Setting this bit activates the stream. 


e CHSEL[2:0]: Channel selection. These bits select the DMA channel for the stream. 


e DIR[1:0]: Data transfer direction. These bits specify the direction of the data transfer (peripheral- 
to-memory, memory-to-peripheral, or memory-to-memory). 


e CIRC: Circular mode. Setting this bit enables circular mode, which allows continuous data transfers. 


e PINC: Peripheral increment mode. When set, this bit increments the peripheral address after 
each data transfer. 


e MINC: Memory increment mode. When set, this bit increments the memory address after 
each data transfer. 


e PSIZE[1:0]: Peripheral data size. These bits specify the size of the data items read from or 
written to the peripheral (8-bit, 16-bit, or 32-bit). 
e MSIZE[1:0]: Memory data size. These bits specify the size of the data items read from or 


written to memory. 


e PL[1:0]: Priority level. These bits set the priority level of the stream (low, medium, high, or 
very high). 


You can find detailed information about this register on page 190 of the STM32F411 reference manual 
(RM0383). Next, we have the DMA Stream Number of Data Register (DMA_SxNDTR). 


DMA Stream Number of Data Register (DMA_SxNDTR) 


The DMA Stream Number of Data Register (DMA_SxNDTR) specifies the number of data items to be 
transferred by the DMA stream. This register is crucial for controlling the length of the data transfer. 


The only field in this register is NDT[15:0]. This stands for the number of data items. This field specifies 
the total number of data items to be transferred. The value in this register is decremented after each 
transfer until it reaches zero. 


Further information about this register can be found on page 193 of the reference manual. Let's move 
on to the DMA Stream Peripheral Address Register (DMA_SxPAR). 
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DMA Stream Peripheral Address Register (DMA_SxPAR) 


The DMA Stream Peripheral Address Register (DMA_SxPAR) holds the address of the peripheral 
data register that the data will be read to or written from. 


The only field in this register is PA[31:0]. This stands for peripheral address. This field contains the 
address of the peripheral data register involved in the data transfer. 


More details about this register can be found on page 194 of the reference manual. Finally, we have 
the DMA Stream Memory Address Registers (DMA_SxMOAR and DMA_SxM1AR). 


DMA Stream Memory Address Registers (DMA_SxMOAR and DMA_SxM1AR) 


These registers store the addresses of the memory locations used for data transfers. The DMA_SxMOAR 
register is used for single buffer mode, while both DMA_SxMOAR and DMA_SxM1AR are used in 
double buffer mode. 


The only field in these registers is MA[31:0]. This stands for memory address. This field contains the 
address of the memory location involved in the data transfer. For detailed information, refer to pages 
194 of the reference manual. 


Developing the ADC DMA driver 


In this section, we will develop three distinct DMA drivers - one for transferring ADC data, another 
for UART data, and a third for transferring data between memory locations. Let’s begin with the 
ADC DMA driver. 


The ADC DMA driver 


Create a copy of your previous project in your IDE, following the steps outlined in earlier chapters. 
Rename this copied project ADC_DMA. Next, create a new file named adc_dma.c in the Src folder 
and another file named adc_dma.h in the Inc folder. 


Populate your adc_dma.c file with the following code: 


#include "adc _dma.h" 


#define GPIOAEN (1U<<0) 
#define ADC1EN (1U<<8) 
#define CR1 SCAN (1U<<8) 
#define CR2 DMA (1U<<8) 
#define CR2 DDS (1U<<9) 
#define CR2 CONT (1U<<1) 
#define CR2 ADCON (1U<<0) 
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#define CR2_SWSTART (1U<<30) 
#define DMA2EN (1U<<22) 
#define DMA_SCR_EN 1U<<0) 
#define DMA_SCR_MINC 1U<<10) 
#define DMA_SCR_PINC 1U<<9) 
#define DMA_SCR_CIRC 1U<<8) 
#define DMA_SCR_TCIE 1U<<4) 
#define DMA_SCR_TEIE 1U<<2) 
#define DMA_SFCR_DMDIS (1U<<2) 


uint16_t adc_raw_data[NUM_OF CHANNELS] ; 


void adc dma init (void) 


{ 


[RR KK KKK KKKKEKGPTO Configuration*** xxx xxx x / 
/*Enable clock access to ADC GPIO Pin's Port*/ 
RCC->AHB1ENR |= GPIOAEN; 


/*Set PAO and PA1 mode to analog mode*/ 


GPIOA->MODER |= (1U<<0) ; 
GPIOA->MODER |= (1U<<1); 
GPIOA->MODER |= (1U<<2) ; 
GPIOA->MODER |= (1U<<3)j; 


[RRKKAKKKEKKEADC Configuration********e* / 
/*Enable clock access to ADC*/ 
RCC- >APB2ENR |= ADC1EN; 


/*Set sequence length*/ 
ADC1->SQR1 |= (1U<<20); 
ADC1->SQR1 &= ~(1U<<21); 
ADC1->SQR1 &= ~(1U<<22); 
ADC1->SQR1 &= ~(1U<<23) 


i 


/*Set sequence*/ 
ADC1->SQR3 = (0U<<0) | (1U<<5); 


/*Enable scan mode*/ 
ADC1->CR1 = CR1_SCAN; 
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/*Select to use DMA*/ 
ADC1->CR2 |=CR2 CONT |CR2_DMA|CR2_DDS; 


[RRRKKKKKKEXKKDMA Configuration*********x / 
/*Enable clock access to DMA*/ 
RCC- >AHB1ENR | =DMA2EN; 


/*Disable DMA stream*/ 
DMA2_ Stream0->CR &=~DMA_SCR_EN; 


/*Wait till DMA is disabled*/ 
while ((DMA2 Stream0->CR & DMA SCR_EN)) {} 


/*Enable Circular mode*/ 
DMA2_Stream0->CR |=DMA_SCR_CIRC; 


/*Set MSIZE i.e Memory data size to half-word*/ 


DMA2 Stream0->CR |= (1U<<13); 
DMA2 Stream0->CR &= ~(1U<<14); 


/*Set PSIZE i.e Peripheral data size to half-word*/ 
DMA2 Stream0->CR |= (1U<<11); 
DMA2 Stream0->CR &= ~(1U<<12); 


/*Enable memory addr increment*/ 
DMA2 Stream0->CR |=DMA_SCR_MINC; 


/*Set periph address*/ 


DMA2 Stream0->PAR = (uint32_t) (&(ADC1->DR) ) ; 
/*Set mem address*/ 
DMA2 Stream0->MOAR = (uint32 t) (&adc_raw_ data) ; 


/*Set number of transfer*/ 
DMA2_ Stream0->NDTR = (uint16_t)NUM_OF_ CHANNELS; 


/*Enable DMA stream*/ 
DMA2 Stream0->CR |= DMA _SCR_EN; 


[RRR KKKKKKKKKADC Configuration***** kx ee / 


/*Enable ADC*/ 
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} 


ADC1->CR2 |=CR2_ADCON; 


/*Start ADC*/ 
ADC1->CR2 |=CR2_SWSTART; 


Let’s go through each part of the code step by step to understand its purpose and functionality. 


1. 


Including the header file and defining constants: We start by including the adc_dma.h 
header file, which calls the stm32£4xx.h file and contains a macro for the number of 
channels of our DMA driver. We then define several constants using #def ine statements. 
These constants represent bit masks for various registers and control flags, making the code 
more readable and maintainable. 


Defining the ADC data array: We declare an array, adc_raw_data, that stores the raw ADC 
data. The size of this array is determined by a predefined constant, NUM_OF_CHANNELS. 


The ADC DMA initialization function: In the adc_dma_init function, we begin by 
enabling the clock for GPIOA, which is necessary for configuring the GPIO pins used by the 
ADC. We then set the mode of the PAO and PA1 pins to analog, as they are connected to the 
ADC channels. 


Next, we enable the clock for ADC1 and configure the ADC sequence length and channel 
sequence. We set the ADC to scan mode, allowing it to convert multiple channels sequentially. 
Additionally, we enable DMA and continuous conversion mode for the ADC. 


DMA configuration: We then enable the clock for DMA2 and ensure that the DMA stream 
is disabled before making any configurations. We configure the DMA stream for circular 
mode, which allows continuous data transfer. We set the memory and peripheral data sizes 
to half-word (16 bits). We enable memory address increment to correctly move through the 
adc_raw_data array and set the peripheral address to the ADC data register. We specify 
the number of data items to transfer and, finally, enable the DMA stream. 


Enable and start ADC: In the final steps, we simply enable the ADC and start the conversion 
process by setting the appropriate control bits. 


Our next task is to populate the adc_dma.h file. Here is the code: 


#ifndef ADC DMA H | 
#define ADC DMA H _ 
#include <stdint.h> 
#include "stm32f4xx.h" 
void adc dma init (void); 
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#define NUM OF CHANNELS 2 
#endif 


Let’s move on to the main . c file. Update your main . file, as shown here: 


#include <stdio.h> 
#include "uart.h" 
#include "adc_dma.h" 


extern uint1l6é t adc_raw_data[NUM_OF_CHANNELS] ; 


int main (void) 

{ 
/*Initialize debug UART*/ 
tiene saie) A 
/*Initialize ADC DMA*/ 
ade_dma_init(); 
while (1) 


{ 


printf ("Value from sensor one ‘el Nias tale rew Carallo) y 
printf ("Value from sensor two : %d \n\r ",adc_raw_data[1]); 
} 


for( int i = 0; i < 90000; i++) { 


} 


This code initializes the UART for debugging, and it sets up the ADC with DMA to continuously read 
sensor data from ADC channels connected to GPIO pins. In the main function, we start by initializing 
the UART for communication, and then we call the adc_dma_init function to configure the ADC 


and DMA for data transfers. In the infinite loop, we repeatedly print the values from two sensors 
stored in the adc_raw_data array to the console. 


To test the project, follow the steps we outlined in Chapter 11. Let’s proceed by developing the UART 
DMA driver. 


Developing the UART DMA driver 


Create a copy of your previous project in your IDE. Rename this copied project UART_DMA. Next, 
create a new file named uart_dma.c in the Src folder and another file named uart_dma.h in 
the Inc folder. Update your uart_dma.c file, as shown here: 


#include "uart_dma.h" 


#define UART2EN (1U<<17) 


#define 


#define 
#define 
#define 
#define 


#define 
#define 
#define 
#define 


#define 
#define 


#define 
#define 
#define 
#define 
#define 
#define 
#define 
#define 


#define 
#define 
#define 


#define 
#define 
#define 


#define 
#define 


static uint16_t compute uart_ bd(uint32_ t periph clk, uint32 t 
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baudrate) ; 


static void uart_ set _baudrate(uint32 t periph clk, uint32_t baudrate) ; 


char uart_data_buffer[UART_DATA BUFF_SIZE] ; 
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balinice) ic e uert cwoley 


void uart2_rx tx_init (void) 
[RRKKKKKKKKEK*CONFfigure UART GPIO PLANK kk kkk kkk RK KR KK / 
/*1.Enable clock access to GPIOA*/ 
RCC- >AHB1ENR |= GPIOAEN; 


/*2.Set PA2 mode to alternate function mode*/ 
GPIOA->MODER &= ~(1U<<4); 
GPIOA->MODER |= (1U<<5) ; 


/*3.Set PA3 mode to alternate function mode*/ 
GPIOA->MODER &= ~(1U<<6); 


GPIOA->MODER |= (alee) i 


/*4.Set PA2 alternate function function type to AF7(UART2_TX) */ 


GPIOA->AFR[0] |= (1U<<8); 
GPIOA->AFR[0] |= (1U<<9); 
GPIOA->AFR[0] |= (1U<<10) ; 
GPIOA->AFR[0] &= ~(1U<<11); 


/*5.Set PA3 alternate function function type to AF7(UART2_TX) */ 


GPIOA->AFR[0] |= (1U<<12) ; 
GPIOA->AFR[0] |= (1U<<13) ; 
GPIOA->AFR [0] le (1U<<14) ; 
GPIOA->AFR[0] &= ~(1U<<15) ; 


[KR RKKKKKKEKKKCONFigure UART Module% > k k kkk k k d kk k k KKK KKK / 


/*6. Enable clock access to UART2*/ 
RCC- >APB1ENR |= UART2EN; 


/*7. Set baudrate*/ 
uart_set_baudrate (CLK, UART_BAUDRATE) ; 


/*8. Select to use DMA for TX and RX*/ 
USART2->CR3 = CR3_DMAT |CR3_DMAR; 


/*9. Set transfer direction*/ 
USART2->CR1 = CR1_ TE |CR1_RE; 


/*10.Clear TC flag*/ 
USART2->SR &=~SR_TC; 
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/*11.Enable TCIE*/ 
USART2->CR1 |=CR1_TCIE; 


/*12. Enable uart module*/ 
USART2->CR1 |= CR1_UE; 


/*13.Enable USART2 interrupt in the NVIC*/ 
NVIC_EnableIRQ(USART2_IRQn) ; 


Next, we have the initialization function: 


void dmal_init (void) 
/*Enable clock access to DMA*/ 
RCC->AHB1ENR | =DMA1EN; 


/*Enable DMA Stream6 Interrupt in NVIC*/ 
NVIC_EnableIRQ(DMA1 Stream6 IRQn) ; 


And then, the function for configuring the rx stream: 


void dmal_stream5_uart_rx_config (void) 
/*Disable DMA stream*/ 
DMA1_Stream5->CR &=~DMA_SCR_EN; 


/*Wait till DMA Stream is disabled*/ 
while ((DMA1 Stream5->CR & DMA SCR_EN)) {} 


/*Clear interrupt flags for stream 5*/ 

DMA1->HIFCR = HIFCR_CDMEIF5 |HIFCR_CTEIF5|HIFCR_CTCIF5; 
/*Set periph address*/ 

DMA1 Stream5->PAR = (uint32_t) (&(USART2->DR) ) ; 


/*Set mem address*/ 
DMA1 Stream5->MOAR = (uint32_ t) (&uart_data_buffer) ; 


*Set number of transfer*/ 
DMA1 Stream5->NDTR = (uint16 t)UART DATA BUFF SIZE; 


SS 
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/*Select Channel 4*/ 


DMA1 Stream5->CR &= ~(1u<<25); 
DMA1 Stream5->CR &= ~(1lu<<26) ; 
DMA1 Stream5->CR |= (1lu<<27) ; 


/*Enable memory addr increment*/ 
DMA1 Stream5->CR |=DMA_SCR_MINC; 


/*Enable transfer complete interrupt*/ 
DMA1 Stream5->CR |= DMA _SCR_TCIE; 


/*Enable Circular mode*/ 
DMA1 Stream5->CR |=DMA_SCR_CIRC; 


/*Set transfer direction : Periph to Mem*/ 
DMA1 Stream5->CR &=~(1U<<6) ; 
DMA1 Stream5->CR &=~ (1U<<7); 


/*Enable DMA stream*/ 
DMA1 Stream5->CR |= DMA_SCR_EN; 


/*Enable DMA Stream5 Interrupt in NVIC*/ 
NVIC_EnableIRQ(DMA1 Stream5 IRQn) ; 


And then, the one for the tx stream 


void dmal_stream6é uart_tx_config(uint32 t msg to_snd, uint32_ t msg_ 
len) 


/*Disable DMA stream*/ 
DMA1_Stream6->CR &=~DMA_SCR_EN; 


/*Wait till DMA Stream is disabled*/ 
while ((DMA1 Stream6é->CR & DMA SCR_EN)) {} 


/*Clear interrupt flags for stream 6*/ 
DMA1->HIFCR = HIFCR_CDMETIF6 |HIFCR_CTEIF6|HIFCR_CTCIF6; 


/*Set periph address*/ 
DMA1 Stream6->PAR = (uint32_t) (&(USART2->DR) ); 


Developing the UART DMA driver 369 


/*Set mem address*/ 
DMA1 Stream6é->MOAR = msg_to_snd; 


/*Set number of transfer*/ 
DMA1 Stream6é->NDTR = msg_len; 


/*Select Channel 4*/ 

DMA1 Stream6->CR &= ~(1lu<<25)j; 
DMA1 Stream6->CR &= ~(1u<<26) ; 
DMA1 Stream6->CR |= (1lu<<27); 


/*Enable memory addr increment*/ 
DMA1 Stream6->CR |=DMA_ SCR_MINC; 


/*Set transfer direction :Mem to Periph*/ 
DMA1 Stream6->CR |=(1U<<é6) ; 
DMA1 Stream6->CR &=~ (1U<<7); 


/*Set transfer complete interrupt*/ 
DMA1 Stream6->CR |= DMA _SCR_TCIE; 


/*Enable DMA stream*/ 
DMA1 Stream6->CR |= DMA _SCR_EN; 


} 


Next, the function for computing the baudrate value for the UART: 


static uint16_t compute _uart bd(uint32_t periph clk, uint32 t 
baudrate) 


{ 
} 


return ((periph clk +( baudrate/2U ))/baudrate) ; 


And then, the function for writing the baudrate value to the baudrate register: 


static void uart_ set _baudrate(uint32 t periph clk, uint32_t baudrate) 


{ 


USART2->BRR = compute _uart_bd(periph_clk,baudrate) ; 
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Next, we have the interrupt handler for Streamé: 


void DMA1 Stream6é_IRQHandler (void) 


{ 


if ((DMA1->HISR) & HIFSR_TCIF6) 


{ 


//do_ssomething 

g exvempilt = iy 

/*Clear the flag*/ 
DMA1->HIFCR |= HPRCRECICEERG:; 


} 


And then, the interrupt handler for Stream5: 


void DMA1 Stream5 IRQHandler (void) 


{ 


if ((DMA1->HISR) & HIFSR_TCIF5) 


{ 


g_rx_cmplt = 1; 


/*Clear the flag*/ 
DMA1->HIFCR le ETRCRECRETESE 


} 


void USART2_IRQHandler (void) 


{ 


G uartyempilit =" ai; 


/*Clear TC interrupt flag*/ 
USART2->SR &=~SR_TC; 


} 


Let’s go through each part of the code step by step. 
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UART initialization 


Inthe uart2_rx_tx_init function, we start by configuring the GPIO pins for UART2 communication. 
We enable the clock for GPIOA, ensuring that the PA2 and PA3 pins can be used. Setting PA2 and 
PA3 to alternate function mode allows them to serve as UART2_TX and UART2_ RX, respectively. 
We further specify the alternate function type to AF7, which is the type for UART2 operations. 


With the GPIO configuration complete, we proceed to enable the clock for UART2. We set the baud 
rate using the uart_set_baudrate function. By enabling DMA for both transmission and 
reception, we offload data handling from the CPU, allowing for more efficient data transfers. We set 
the transfer direction to both transmit and receive, clear any pending transmission complete flags, and 
enable the transmission complete interrupt. Finally, we enable the UART module and configure the 
NVIC to handle UART2 interrupts, ensuring that the system can respond to UART events promptly. 


DMA initialization 


Here, we start by enabling the clock for DMA1, ensuring that the DMA controller is powered and ready 
for configuration. Additionally, we enable the DMA Stream6 interrupt in the NVIC, preparing the 
system to handle DMA-related interrupts efficiently. 


DMA configuration for UART reception 


In dmal_stream5_uart_rx_config, we configure DMA1 Stream5 to receive data via UART2. 
First, we disable the DMA stream to safely configure its parameters. After ensuring that the stream 
is disabled, we clear any existing interrupt flags. We set the peripheral address to the UART2 data 
register (USART2-DR) and the memory address to our uart_data_buffer, where incoming 
data will be stored. 


We specify the number of data items to transfer and select the appropriate DMA channel. Enabling 
memory address increment mode ensures that the data buffer is filled sequentially. Circular mode 
is enabled to allow continuous data reception, and the transfer direction is set from peripheral to 
memory. We enable the DMA stream and configure the NVIC to handle Stream5 interrupts, ensuring 
that the system is prepared for DMA events. 


DMA configuration for UART transmission 


The dmal_stream6_uart_tx_config function configures DMA] Stream6 to transmit data via 
UARTZ2. Similar to the reception configuration, we start by disabling the DMA stream and clearing 
any existing interrupt flags. We set the peripheral address to the UART2 data register and the memory 
address to the data buffer that will be transmitted. 


We specify the number of data items to transfer and select the appropriate DMA channel. Memory 
address increment mode is enabled to ensure that data is transmitted sequentially from the buffer. 
We set the transfer direction from memory to peripheral and enable the transfer complete interrupt. 
Finally, we enable the DMA stream to start the data transmission process. 
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Helper functions 


The compute_uart_hd function calculates the UART baud rate setting based on the peripheral 
clock and desired baud rate. The uart_set_baudrate function uses this computed value to set the 
baud rate in the UART’s BRR register, ensuring that UART communication occurs at the correct speed. 


Interrupt handlers 
Let’s break down the interrupt handlers: 


e DMA1 Stream6_IRQHandler: This handler responds to DMA Streamé6 interrupts. When 
a transfer completes, it sets the g_tx_cmp1t flag and clears the interrupt flag, ensuring that 
the system is aware that the transmission is complete. 


e DMA1 Stream5_IRQHand1ler: This handler responds to DMA Stream5 interrupts. When 
a transfer completes, it sets the g_rx_cmp1t flag and clears the interrupt flag, indicating that 
new data has been received. 


e USART2_IRQHandler: This handler manages UART2 interrupts. It sets the g uart_cmplt 
flag when a UART event occurs and clears the transmission complete flag, maintaining the 
proper flow of UART communication. 


Next, we populate the uart_dma.h file. Here is the code: 


#ifndef UART DMA H _ 

#define UART DMA H _ 

#include <stdint.h> 

#include "stm32f4xx.h" 

#define UART DATA BUFF SIZE 5 
void uart2_ rx tx_init (void); 

void dmal_ init (void); 

void dmal_stream5 uart rx config(void) ; 


void dmal_stream6é uart_tx_config(uint32 t msg to_snd, uint32 t msg_ 
len) ; 


#endif£ 


And then, we populate the main. c file: 


#include <stdio.h> 
#include <string.h> 


#include "uart.h" 
#include "uart_dma.h" 


extern uint8 t g_rx_cmplt; 
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externas tomco c e uere cwoley 
ektern timto c g es GmIle; 


extern char uart_data_buffer[UART DATA BUFF_SIZE]; 
char msg buff[150] ={'\0o'}; 


int main(void) 


{ 


vaea mos ers kane (p 

dmal_ init (); 

dmal_stream5 uart_rx_config(); 

sprintf (msg buff,"Initialization...cmplt\n\r") ; 
dmal_stream6é_uart_tx_config((uint32_t)msg_buff,strlen(msg_ buff) ) ; 


while(!g tx _cmplt) {} 


while (1) 


IIe (Sl ses Chase), 

{ 
sprintf(msg buff, "Message received : %s \r\n",uart_data_ 
putten) 
Gj m amol = Of 
Gj ts Culolle = Of 
G vene Gwol = Op 
dmal_stream6é_uart_tx_config((uint32_t)msg_buff,strlen(msg_ 
buf£) ) ; 
while(!g tx _cmplt) {} 


} 


In the main function, we start by initializing the UART and DMA, and then we configure DMA1 
Stream5 for UART reception and DMA1 Stream6 for UART transmission. We prepare a message that 
indicates initialization completion and initiate its transmission via DMA. The main loop continuously 
checks whether a UART message has been received. When a message is received, it formats the received 
data into a response message, resets the completion flags, and transmits the response using DMA. 
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Testing the project 


To test the project, compile the code and upload it to your microcontroller. Open RealTerm or any other 
serial terminal application, and then configure it with the appropriate port and baud rate to view the 
debug messages. Press the black push button on the development board to reset the microcontroller. 
Ensure the output area of RealTerm is active by clicking on it. Then, type any five keys on your keyboard. 
You should see these keys appear in the output area of RealTerm. The microcontroller receives the typed 
keys through the dmal_stream5_uart_rx_config function, stores them in themsg_buff, 
and transmits them to your host computer's serial port via the dmal_stream6é_uart_tx_ config 
function. The last received data remains in msg_buf f for further processing if needed. We type five 
characters because the UART_DATA BUFF_SIZEis set to 5 inthe uart_dma.h file. 


In the next section, we will develop our final DMA driver - the DMA memory-to-memory driver. 


Developing the DMA memory-to-memory driver 


Create a copy of your previous project in your IDE and rename it DMA_MemToMem. Next, create a 
new file named dma. c in the Src folder and another file named dma .h in the Inc folder. Update 
your dma . c file, as shown here: 


#include "dma.h" 


#define DMA2EN (GU 223) 
#define DMA SCR_EN (1U<<0) 
#define DMA SCR _MINC (1U<<10) 
#define DMA SCR_PINC (1U<<9) 
#define DMA SCR TCIE (1U<<4) 
#define DMA SCR TEIE (1U<<2) 
#define DMA SFCR DMDIS (1U<<2) 


void dma2 mem2mem config (void) 


{ 


/*Enable clock access to the dma module*/ 
RCC- >AHB1ENR |= DMA2EN ; 


/*Disable dma stream*/ 
DMA2_ Stream0->CR = 0; 


/*Wait until stream is disabled*/ 
while ((DMA2 Stream0->CR & DMA SCR_EN)) {} 


/*Configure dma parameters*/ 


/*Set MSIZE i.e Memory data size to half-word*/ 
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} 


DMA2_ Stream0->C 
DMA2_ Stream0->C 


/*Sete PSEZE ive 
DMA2_ Stream0->C 
DMA2 Stream0->C 


/*Enable memory 
DMA2 Stream0->C 


/*Enable periph 
DMA2_ Stream0->C 


/*Select mem-to 
DMA2_ Stream0->C 
DMA2 Stream0->C 


/*Enable transf 
DMA2 Stream0->C 


/*Enable transf 
DMA2_ Stream0->C 


/*Disable direc 
DMA2 Stream0->F 


/*Set DMA FIFO 
DMA2 Stream0->F 
DMA2 Stream0->F 


/*Enable DMA in 


R j= (GlWec<ils) - 
R & ~(1U<<14); 


Peripheral data size to half-word*/ 
R = (lWiccili) p 
R &= ~(1U<<12); 


addr increment*/ 
R |=DMA_SCR_MINC; 


eral addr increment*/ 
R |=DMA_SCR_PINC; 


-mem transfer*/ 
R & ~(1U<<6); 
a = (iesms 


er complete interrupt*/ 
R |= DMA _SCR_TCIE; 


er error interrupt*/ 
R |= DMA SCR _TEIE; 


t mode*/ 
CR |=DMA_SFCR_DMDIS; 


threshold*/ 
CR |=(1U<<0) ; 
CR |=(1U<<1); 


terrupt in NVIC*/ 


NVIC_EnableIRQ(DMA2 Stream0 IRQn) ; 


This function sets up the DMA2 controller for memory-to-memory data transfers. It begins by enabling 
the clock for the DMA2 module and ensures that the DMA stream is disabled before making any 
configuration changes. The function configures the data size for both memory and peripheral to 
half-word (16-bit) and enables automatic incrementing of the memory and peripheral addresses. 
It sets the transfer direction to memory-to-memory and enables interrupts for transfer completion 
and transfer errors, ensuring robust error handling and efficient operation. Direct mode is disabled 
to use FIFO mode, and the FIFO threshold is set to fu11. Finally, the function enables the DMA 
stream and configures the NVIC to handle DMA interrupts, ensuring that the system can respond to 
DMA events appropriately. We also have the dma_transfer start function: 
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void dma transfer start(uint32 t src_buff, uint32 t dest_buff, 
Uimts te Aen) 


{ 


/*Set peripheral address*/ 
DMA2 Stream0->PAR = src_ buff; 


/*Set memory address*/ 
DMA2 Stream0->MOAR = dest_buff; 


/*Set transfer length*/ 
DMA2 Stream0->NDTR = len; 


/*Enable dma stream*/ 
DMA2_ Stream0->CR |= DMA _SCR_EN; 


} 


This function initiates the DMA transfer by configuring the source and destination addresses and the 
length of the data transfer. It begins by setting the peripheral address of the value passed in src_buff 
and the memory address of the value passed in dest_buf f. The transfer length is then specified by 
setting the NDTR register to Len, indicating the number of data items to transfer. Finally, the function 
enables the DMA stream by setting the EN bit in the CR register, thereby starting the data transfer 
from the source to the destination. This is the dma . h file: 


#ifndef DMA H _ 
#define DMA H _ 
#include <stdint.h> 
#include "stm32f4xx.h" 


#define LISR TCIFO (1U<<5)) 
#define LIFCR_CTCIFO (1U<<5) 
#define LISR TEIFO (1U<<3) 
#define LIFCR_CTEIFO (1U<<3) 


void dma2 mem2mem config (void); 


void dma _ transfer start(uint32 t src_buff, uint32 t dest_buff, 
ilalintec\} te eni 


#endif 


And this is the main . c file: 


#include <stdio.h> 
#include <string.h> 
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alinelkbrels Witieicie . lav 
#include "dma.h" 


#include "uart.h" 
#define BUFFER SIZE 5 
uinti t sensor _data_arr[BUFFER_SIZE] = {892,731,1234,90,23}; 
uintlé_ t temp_data_arr[BUFFER_SIZE] ; 
volatile uint8 t g_transfer_cmplt; 
int main(void) 
g_ transfer iempilt = 0); 


taise mieh) 5 
dma2_mem2mem_config(); 


dma_transfer_ start ((uint32_t)sensor data_arr, (uint32_t) temp_data_ 
arr, BUFFER _SIZE) ; 


/*Wait until transfer complete*/ 
while(!g transfer _cmplt) {} 


for( int i = 0; i < BUFFER SIZE; i++) 


{ 


printf ("Temp buffer[%d]: %d\r\n",i,temp_data_arr[i]); 


g_transtervempiit = 0); 


while(1) 


} 


The main function sets up and initiates a memory-to-memory DMA transfer, transferring data from the 
globally declared and initialized sensor_data_arr to the uninitialized temp_data_arr. It starts 
by initializing the transfer complete flag, g transfer _cmp1t, to 0, ensuring that we can monitor 
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the transfer status. The function then initializes UART for debugging purposes and configures the DMA 
using dma2_mem2mem_config. The DMA transfer is started by calling dma_transfer_start, 
specifying the source (Sensor_data_arr), destination (temp_data_arr), and the length of 
the transfer (BUFFER_SIZE). The function then enters a loop, waiting until the transfer is complete, 
indicated by g_transfer_cmp1t being set to 1. Once the transfer is complete, it prints the contents 
of the temp_data_arr to the console, confirming that the data has been successfully transferred. 


Our main. c file also contains the DMA streams IRQHandler: 


void DMA2 Stream0_IRQHandler (void) 


{ 
/*Check if transfer complete interrupt occurred*/ 
if ((DMA2->LISR) & LISR_TCIFO) 
{ 
g transfer cmplt = 1; 
/*Clear flag*/ 
DMA2->LIFCR |=LIFCR_CTCIFO; 
} 
/*Check if transfer error occurred*/ 
if ((DMA2->LISR) & LISR_TEIFO) 
{ 
/*Do something...*/ 
/*Clear flag*/ 
DMA2->LIFCR le IIHS (CARNAL) 9 
} 
} 


The handler manages both transfer completion and error events. The function first checks whether 
the transfer complete interrupt flag (TCI FO) is set in the low interrupt status register (LISR). If this 
flag is set, it indicates that the DMA transfer has successfully finished. The function then sets the 
g_transfer_cmp1t flag to 1 to signal to the main function that the transfer is complete, and it 
clears the interrupt flag by writing to the low interrupt flag clear register (LIFCR). Additionally, the 
function checks for a transfer error interrupt (TEIFO). Ifa transfer error is detected, it performs any 
necessary error handling and clears the error flag in LIFCR. This interrupt handler ensures smooth 
operation by promptly handling the completion of data transfers and addressing any errors that might 
occur during the process. 


Now, it’s time to test the project. To test the project, compile the code and upload it to your microcontroller. 
Open RealTerm or another serial terminal application, and then configure it with the appropriate port 
and baud rate to view the debug messages. Press the black push button on the development board to 
reset the microcontroller. You should see the sensor values printed, indicating that the values have 
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been successfully copied from sensor_data_arrtotemp_data_arr, as our code only prints 
the contents of temp_data_arr. 


Summary 


In this chapter, we learned about DMA, an important feature in microcontrollers for enhancing data 
throughput and offloading the CPU from routine data transfer tasks. We began by discussing the basic 
principles of DMA, emphasizing its role in high-performance embedded systems. We explored how 
DMA works and its significance in improving system efficiency, by allowing peripherals to transfer 
data directly to and from memory without continuous CPU intervention. 


Then, we focused on the STM32F4 implementation of the DMA controller, examining its key features 
and configuration options. We detailed the structure of the DMA controller, including its channels, 
streams, and key registers, such as the Stream Configuration Register (DMA_SxCR), Stream Number 
of Data Register (DMA_SxNDTR), Stream Peripheral Address Register (DMA_SxPAR), and Stream 
Memory Address Registers (DMA_SxMOAR and DMA_SxM1AR). This provided a comprehensive 
understanding of how to configure and control DMA for various data transfer operations. 


We provided practical examples to solidify our understanding, including the development of DMA 
drivers for different use cases, such as ADC data transfers, UART communications, and memory-to- 
memory transfers. These examples involved initializing DMA, setting up the necessary parameters, 
and implementing functions to handle data transfers efficiently. 


In the next chapter, we will learn about power management energy efficiency techniques in 
embedded systems. 
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Power Management 
and Energy Efficiency in 
Embedded Systems 


In this chapter, we will delve into power management and energy efficiency in embedded systems, a 
critical aspect in today’s technology-driven world. Efficient power management is vital for prolonging 
battery life and ensuring optimal performance in embedded devices. This chapter aims to equip you 
with the necessary knowledge and skills to implement effective power management techniques in 
your designs. 


We will begin by exploring various power management techniques, laying the foundation to understand 
how to reduce power consumption in embedded systems. Following this, we will examine the different 
sleep modes and low-power states available in STM32F4 microcontrollers, providing detailed insights 
into their configurations and applications. Then, we will discuss the wake-up sources and triggers in 
the STM32F4, which are essential to ensure that the microcontroller can respond promptly to external 
events. Finally, we will put theory into practice by developing a driver to enter standby mode and 
wake up the microcontroller, demonstrating how to apply these concepts in real-world scenarios. 


In this chapter, we will cover the following main topics: 


e An overview of power management techniques 
e Low-power modes in STM32F4 
e Wake-up sources and triggers in STM32F4 


e Developing a driver to enter standby mode and wake up 


By the end of this chapter, you will have a thorough understanding of power management in embedded 
systems and be able to implement energy-efficient designs using STM32F4 microcontrollers. This 
knowledge will enable you to create embedded systems that optimize power consumption and extend 
battery life, essential for modern applications. 
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Technical requirements 


All the code examples for this chapter can be found on GitHub at https://github.com/ 
Packt Publishing/Bare-Metal -Embedded-C- Programming. 


An overview of power management techniques 


In this section, we will explore the world of power management techniques, a crucial aspect of 
embedded systems design. As our devices become more advanced and our expectations for battery 
life increase, understanding how to manage power effectively is more important than ever. Let's dive 
into the various power management techniques and how they are implemented, taking a look at some 
case studies to see these techniques in action. 


Power management in embedded systems involves a combination of hardware and software strategies 
designed to reduce energy consumption. This is particularly important for battery-powered devices, 
where efficient power usage can significantly extend battery life. The main techniques we'll cover 
include Dynamic Voltage and Frequency Scaling (DVFS), clock gating, power gating, and utilizing 
low-power modes. 


Let’s start with DVFS. 


Dynamic Voltage and Frequency Scaling (DVFS) 


DVFS is a method where the voltage and frequency of a microcontroller are adjusted based on the 
workload. By lowering the voltage and frequency during periods of low activity, power consumption 
can be greatly reduced. Conversely, during periods of high demand, the voltage and frequency are 
increased to ensure performance. 


How is DVFS implemented? 


In STM32 microcontrollers, DVFS can be managed through specific power control registers. These 
registers allow a system to dynamically adjust the operating points based on the required performance 
levels. For example, the STM32F4 series has several power modes that can be configured to adjust 
the system clock and core voltage. 


An example use case - mobile phones 


Mobile phones are a prime example of DVFS in action. When a phone is idle, it reduces the CPU 
frequency and voltage to save the battery. As soon as you start using an app or playing a game, the 
CPU ramps up its frequency and voltage to provide the necessary performance. This balance between 
performance and power savings is what makes modern smartphones so efficient. 


Another common technique is clock gating. 
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Clock gating 


Clock gating is a technique where the clock signal to certain parts of a microcontroller is turned off 
when they are not in use. This prevents unnecessary switching of transistors, which in turn saves power. 


How is clock gating implemented? 


Clock gating is typically controlled through clock control registers. In the STM32 series, each peripheral’s 
clock can be enabled or disabled individually using these registers. For instance, if a particular peripheral 
such as the ADC is not needed, its clock can be disabled to save power. 


An example use case - smart home devices 


Smart home devices such as smart thermostats or lights use clock gating to manage power efficiently. 
These devices spend a significant amount of time in a low-power state, waking up only to perform 
specific tasks. By gating the clock to unused peripherals, these devices can conserve energy and extend 
their battery life. 


Another technique is power gating. 


Power gating 


Power gating takes power savings a step further by completely shutting off the power to certain parts 
of a microcontroller. This technique ensures zero power consumption for the powered-down sections. 


How is power gating implemented? 


Power gating is more complex than clock gating and often involves dedicated power management 
units within a microcontroller. These units control the power supply to various domains of the 
microcontroller. In STM32 microcontrollers, power gating can be configured using the power control 
registers to turn off specific peripherals, or even entire sections of the microcontroller. 


An example use case - wearable devices 


Wearable devices, such as fitness trackers, benefit greatly from power gating. These devices need to 
operate for extended periods on a single charge. By powering down sensors and other components 
when they are not in use, wearables can achieve longer battery life without compromising functionality. 


Next, let’s discuss low-power modes. 


Low-power modes 


Low-power modes are predefined states within microcontrollers that significantly reduce power 
consumption by disabling or reducing the functionality of various components. These modes range 
from simple CPU sleep modes to more complex deep sleep or standby modes. 
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How are low-power modes implemented? 


Low-power modes are implemented through the power control registers. The STM32F4 microcontrollers, 
for example, offer several low-power modes, including sleep, stop, and standby. Each mode provides 
a different balance between power savings and wake-up time. 


An example use case - remote sensors 


Remote sensors used in agriculture or environmental monitoring often use low-power modes. These 
sensors might spend the majority of their time in a low-power state, waking up periodically to take 
measurements and transmit data. By leveraging low-power modes, these sensors can operate for 
months or even years on a single battery charge. 


Now, lets take a closer look at a couple of case studies that illustrate how a combination of these power 
management techniques is used in real-world applications. 


Case study 1 - an energy-efficient smartwatch 


Smartwatches are a great example of a device that relies heavily on power management techniques. 
These devices need to balance performance with battery life, as users expect them to run for days 
on a single charge. Let’s break down the roles the different techniques play in an energy-efficient 
smartwatch design: 


e DVFS: The smartwatch uses DVFS to adjust the CPU frequency based on the current workload. 
When the user interacts with the watch, the CPU frequency increases to provide a smooth 
experience. When the watch is idle, the frequency is lowered to save power. 


e Clock gating: Peripherals such as the GPS or heart rate monitor are only powered when needed. 
When these features are not in use, their clocks are gated to conserve energy. 


e Power gating: Components like the display driver are powered down completely when the 
display is off. 


e Low-power modes: The watch enters deep sleep mode during periods of inactivity, waking up 
only to check for notifications or user interactions. 


By combining these techniques, smartwatches can achieve impressive battery life without compromising 
on functionality. Another excellent example is solar-powered environmental monitoring. 
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Case study 2 - a solar-powered environmental monitor 


A solar-powered environmental monitor deployed in remote locations must operate efficiently to 
ensure continuous data collection and transmission. The roles are as follows: 


e DVFS: The monitor adjusts its operating frequency based on the intensity of sunlight and battery 
charge. During peak sunlight hours, it operates at a higher frequency to process more data. 


e Clock gating: Sensors such as temperature, humidity, and air quality are only active during 
data collection intervals. The clocks to these sensors are gated when not in use. 


e Power gating: Non-essential components are completely powered down during nighttime or 
cloudy periods to conserve energy. 


e Low-power modes: The monitor enters deep sleep mode between data collection intervals, 
waking up periodically to take measurements and transmit data. 


With these power management techniques, the monitor can operate autonomously for extended 
periods, relying solely on solar power. 


Power management is a vital aspect of embedded system design, especially as devices become more 
portable and battery-dependent. By understanding and implementing techniques such as DVFS, clock 
gating, power gating, and low-power modes, we can design embedded systems that are both powerful 
and energy-efficient. Whether it's a smartwatch, a remote sensor, or any other battery-powered device, 
effective power management ensures longer battery life and better overall performance. As we continue 
to push the boundaries of what embedded systems can do, mastering these power management 
techniques will be more important than ever. 


In the next section, we will explore the low-power modes in our STM32F4 microcontroller. 


Low-power modes in STM32F4 


In this section, we will learn about the low-power modes available in STM32F4 microcontrollers. 
We'll cover the various low-power modes, how to configure them, and the practical aspects of using 
them in your projects. 


Let’s start by understanding these low-power modes. Low-power modes in the STM32F4 microcontrollers 
are designed to reduce power consumption by disabling or limiting the functionality of certain 
components. The STM32F4 offers several low-power states, each providing a different balance between 
power savings and wake-up latency. These modes include sleep, stop, and standby modes. 


We can put our system into low-power mode by executing the Wait For Interrupt (WFI) or Wait For 
Event (WFE) instructions, or setting the SLEEPONEXTIT bit in the Cortex’-M4 with FPU system 
control register on return from an ISR. 


Let’s dive into the details of each low-power mode, starting with sleep mode. 
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Sleep mode 


Sleep mode is the most basic low-power mode, where the CPU clock is stopped but peripherals 
continue to operate. This mode offers a quick wake-up time, making it ideal for applications that 
require frequent transitions between active and low-power states. 


To enter sleep mode, we need to clear the SLEEPDEEP bit in the System Control Register (SCR) 
and then execute the WFI or WFE instruction, as shown in the following snippet: 


void enter sleep mode(void) { 
// Clear the SLEEPDEEP bit to enter Sleep mode 
SCB->SCR &= ~SCB SCR_SLEEPDEEP Msk; 


// Request Wait For Interrupt 
WEA); 


} 


The microcontroller exits Sleep mode upon any interrupt or event. Since the peripherals remain 
active, any configured interrupt from a peripheral can wake the CPU. 


An example use case is sensor monitoring. 


For applications such as continuous sensor monitoring, sleep mode provides an efficient way to reduce 
power consumption without sacrificing responsiveness. The microcontroller can wake up quickly to 
process sensor data and then return to sleep mode. 


The next mode is stop mode. 


Stop mode 


Stop mode offers a deeper power-saving state than sleep mode by stopping the main internal 
regulator and halting the system clock. Only the low-speed clock (LSI or LSE) remains active. This 
mode provides a moderate wake-up time and significant power savings. 


To enter stop mode, set the SLEEPDEEP bit in the Power Control Register (PWR_CR), and then 
execute the WFI or WFE instruction, as shown in the following snippet. Additional configuration 
can also be applied to further reduce power consumption in stop mode: 


void enter stop mode(void) { 
// Set SLEEPDEEP bit to enable deep sleep mode 
SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk; 
// Request Wait For Interrupt 
VETOR 
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The MCU exits stop mode upon any external interrupt or wake-up event from configured EXTI 
lines, RTC alarms, or other configured wake-up sources. The wake-up time from stop mode is longer 
than from sleep mode, but it still allows for a relatively quick return to full operation. 


An example use case is periodic data logging. 


In applications such as data logging, a microcontroller can remain in stop mode and wake up periodically, 
based on RTC alarms, to log data, and then return to stop mode. This significantly reduces power 
consumption while ensuring regular data logging. 


The final mode is standby mode. 


Standby mode 


Standby mode provides the highest power savings by turning off most internal circuitry, including 
the main regulator. Only a small portion of the microcontroller remains powered to monitor wake-up 
sources. This mode has the longest wake-up time but offers the lowest power consumption. 


To enter standby mode, set the PDDS and SLEEPDEEP bits in the Power Control (PWR_CR) register, 
and then configure the wake-up sources. This snippet demonstrates how to enter standby mode: 


void enter standby mode (void) { 
// Clear Wakeup flag 
PWR->CR |= PWR_CR_CWUF; 


// Set the PDDS bit to enter Standby mode 
PWR->CR |= PWR_CR_PDDS; 


// Set the SLEEPDEEP bit to enable deep sleep mode 
SCB->SCR |= SCB_SCR_SLEEPDEEP Msk; 


// Request Wait For Interrupt 
VETOR 


} 


The microcontroller exits standby mode upon a wake-up event from an external wake-up pin 
(WKUP), an RTC alarm, or a reset event. When the microcontroller wakes up from standby mode, 
it undergoes a full reset sequence, and the execution starts from the reset vector. 


An example use case is remote IoT devices. 


Standby mode is perfect for remote IoT devices that need to operate for extended periods on battery 
power. These devices can remain in standby mode most of the time and wake up only for critical 
events or scheduled tasks, thus maximizing battery life. 


Now that we understand how to enter the various low-power modes, we will look at how to wake up 
from them. 
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Wake-up sources and triggers from low-power modes in 
STM32F4 


While low-power modes help conserve energy, ensuring that a microcontroller can wake up promptly 
when needed is equally important. The STM32F4 microcontroller series provides a variety of wake-up 
sources and triggers to handle this effectively. In this section, we'll explore these wake-up sources, 
how they function, and their practical applications. 


Understanding wake-up sources 


Wake-up sources are mechanisms that bring a microcontroller out of a low-power state. The STM32F4 
offers several types of wake-up sources, each suited for different scenarios. These include external 
interrupts, RTC alarms, watchdog timers, and various internal events. By understanding these triggers, 
we can design systems that balance power efficiency with responsiveness. 


The wake-up sources can be grouped as follows: 
e External interrupts 
e Real-Time Clock (RTC) alarms 


e Internal events 
Let’s delve into each of these wake-up sources to understand how they work and their typical use cases. 
External interrupts 


External interrupts are one of the primary wake-up sources for STM32F4 microcontrollers. These 
interrupts can be triggered by events on specific GPIO pins. When a microcontroller is in a low-power 
mode, an external signal, such as a button press or sensor output, can wake it up. 


Heres how it works: 


e GPIO configuration: Configure the GPIO pins to act as interrupt sources. This involves setting 
pin mode and enabling the interrupt on the desired edge (rising, falling, or both). 


e EXTI configuration: Each GPIO pin can be mapped to an EXTI line, which can be configured 
to generate an interrupt. 


e NVIC configuration: Enable the EXT] line interrupt in the Nested Vectored Interrupt 
Controller (NVIC) to ensure that the microcontroller responds to the external event. 


Example use cases are a smart doorbell system and smart lighting. 


Wake-up sources and triggers from low-power modes in STM32F4 


Imagine a smart doorbell system. The microcontroller remains in a low-power mode to conserve battery 
life. When someone presses the doorbell button (connected to a GPIO pin), an external interrupt is 
triggered, waking the microcontroller to process the event and send a notification to the homeowner. 
Another excellent example is smart home lighting systems. 


A smart home lighting system needs to conserve energy while being responsive to user inputs. The 
microcontroller stays in a low-power mode until an external interrupt (e.g., a motion sensor detecting 
movement) wakes it up. Upon waking, the microcontroller processes the event, turns on the lights, 
and then goes back to sleep after a predefined period of inactivity. 


The next wake-up source we will examine is the RTC Alarm. 


RTC alarms 


The RTC is a versatile peripheral that can generate wake-up events at specific intervals or predefined 
times. It is particularly useful for applications requiring periodic wake-ups, such as data logging or 
scheduled tasks. 


Heres how it works: 


e RTC configuration: Configure the RTC to generate alarms or periodic wake-up events. This 
involves setting the RTC clock source, enabling the wake-up timer, and setting the alarm time. 


¢ Interrupt handling: Enable the RTC alarm or wake-up interrupt in the NVIC to ensure that 
the microcontroller wakes up when the alarm or timer event occurs. 


An example use case is an environmental monitoring system. 


Consider a remote environmental monitoring system that logs temperature and humidity data. The 
microcontroller can be put into low-power mode, waking up at regular intervals (e.g., every hour) 
using RTC alarms to read sensors and log data, and then return to the low-power state. 


The final wake-up sources we will examine are internal events. 


Internal events 


Apart from external triggers, internal events can also wake up a microcontroller from low-power 
modes. These events include the following: 


e Peripheral events: Events generated by internal peripherals, such as ADC conversions or 
communication interface activity 


e System events: Internal system events such as power voltage detection or clock stability issues 
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Heres how it works: 


e Peripheral configuration: Configure the peripheral to generate interrupts upon specific events. 
For instance, an ADC can generate an interrupt when a conversion is complete. 


e Event handling: Enable the relevant interrupts in the NVIC to handle these internal events 
and wake the microcontroller. 


An example use case is a fitness tracker 


A wearable fitness tracker that monitors heart rate can use the ADC to read sensor data. The 
microcontroller stays in a low-power mode and wakes up when the ADC completes a conversion, 
allowing it to process and store the heart rate data. 


Before we conclude this section, let’s summarize some key practical considerations to keep in mind 
when configuring wake-up sources. 


Practical considerations 
When configuring wake-up sources, you must consider the following practical aspects: 


e Response time: Ensure the chosen wake-up source can provide the required response time for 
your application. External interrupts typically offer the fastest wake-up times. 


e Power consumption: Balance power consumption with wake-up requirements. RTC alarms 
and watchdog timers can be configured for periodic wake-ups with minimal power overhead. 


e Reliability: Choose reliable wake-up sources for critical applications. Watchdog timers are 
essential for safety-critical systems to ensure that a microcontroller can recover from faults. 


e Peripheral configuration: Ensure that peripherals needed for wake-up are properly configured 
and their clocks remain enabled, even in low-power states. 


Understanding and properly configuring these wake-up sources ensures that your embedded systems 
are both energy-efficient and reliable. 


In the next section, we will learn how to develop a driver to enter standby mode and subsequently 
wake up the system. 


Developing a driver to enter standby mode and wake up 


Create a copy of your previous project in your IDE, following the steps outlined in earlier chapters. 
Rename this copied project StandByModeWithWakeupPin. Next, create a new file named 
standby _mode.c in the Src folder, and then another file named standby_mode.h in the 
Inc folder. 
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Populate your standby _mode.c file with the following code: 


#include "standby mode.h" 
#define PWR MODE STANDBY (PWR_CR_PDDS) 
#define WK PIN (1U<<0) 


static void set power mode(uint32 t pwr_mode) ; 
void wakeup pin init (void) 


{ 


//Enable clock for GPIOA 


RCC- >AHB1ENR l= RCC_AHB1ENR_ GPIOAEN; 
//Set PAO as input pin 

GPIOA->MODER &= ~(1U<<0O); 
GPIOA->MODER &= ~(1U<<1); 

//No pull 

GPIOA->PUPDR &= ~(1U<<0O); 
GPIOA->PUPDR &= ~(1U<<1); 


} 


This function is responsible for configuring PAO to be used as a wake-up pin for exiting low-power 
mode. It sets PAO as an input pin by clearing the corresponding bits in the GPIOA mode register, and 
then it configures the pin with no pull-up or pull-down resistors: 


void standby wakeup pin setup (void) 


{ 


/*Wait for wakeup pin to be released*/ 
while(get_ wakeup pin state() == 0) {} 


/*Disable wakeup pin*/ 
PWR->CSR &=~(1U<<8) ; 


/*Clear all wakeup flags*/ 
PWR->CR |=(1U<<2); 


/*Enable wakeup pin*/ 
PWR->CSR |=(1U<<8) ; 


/*Enter StandBy mode*/ 
set_power_ mode (PWR_MODE STANDBY) ; 
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/*Set SLEEPDEEP bit in the CortexM System Control Register*/ 
SCB -SCR |=(1U<<2); 


/*Wait for interrupt*/ 
__ PIE (O) 


} 


This function prepares the microcontroller to enter standby mode and ensures that it can wake up 
via the configured wake-up pin. It begins by waiting for the wake-up pin to be released, ensuring that 
the pin is in a stable state before proceeding. The function then disables the wake-up pin to clear any 
residual settings, followed by clearing all wake-up flags to reset the wake-up status. After re-enabling 
the wake-up pin, the function sets the power mode to Standby by configuring the appropriate power 
control register. Finally, the function executes the WFI instruction, placing the microcontroller into 
standby mode until an interrupt occurs, triggering the wake-up process: 


uint32 t get wakeup pin state(void) 


{ 
} 


return ((GPIOA->IDR & WK_ PIN) == WK PIN); 


This function checks the current state of the wake-up pin (PAO). It reads the input data register (IDR) 
for GPIOA and performs a bitwise AND operation with the wake-up pin’s bit mask (WK_ PIN). This 
operation isolates the state of PAO. The function then compares this result to the bit mask itself to 
determine whether the pin is high. If PAO is high, the function returns true; otherwise, it returns false: 


static void set_power_mode(uint32_t pwr_mode) 


{ 


MODIFY _REG(PWR->CR, (PWR _CR_PDDS | PWR_CR_LPDS | PWR_CR_FPDS | PWR_ 
CR_LPLVDS | PWR_CR_MRLVDS), pwr mode); 


} 


This function configures the power mode of the STM32F4 microcontroller by modifying specific bits 
in PWR_CR. This function takes a parameter, pwr_mode, which specifies the desired power mode 
settings. It uses the MODIFY_REG macro to update the PWR_CR register, specifically targeting the 
bits related to different power modes such as PDDS (Power Down Deepsleep), LPDS (Low-Power 
Deepsleep), FPDS (Flash Power Down in Stop Mode), LPLVDS (Low-Power Regulator in Low 
Voltage in Deepsleep), and MRLVDS (Main Regulator in Low Voltage in Deepsleep). 


Next, we will populate the standby _ mode.h file: 


#ifndef STANDBY MODE H _ 
#define STANDBY MODE H _ 


#include <stdint.h> 
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#include "stm32f4xx.h" 


uint32 t get wakeup pin state(void) ; 
void wakeup pin init (void); 


void standby wakeup pin setup (void); 
#endif 


We are now ready to test inside main . c. Update your main . file, as shown here: 


#include <stdio.h> 
#include <string.h> 
#include "standby mode.h" 
#include "gpio exti.h" 


#include "uart.h" 
PLMes) ie G loyein_jeracysys) 7 
static void check_reset_source (void); 


int main (void) 


{ 
Weusie_alimalic: (()) F 
wakeup pin init(); 


/*Find reset source*/ 
ehechiresciems Ounce cue)y, 


/*Initialize EXTI*/ 
joyelie) tevaeat Sallie ()) § 


while (1) 


} 


The main function starts by initializing the UART for serial communication with uart_init, 
ensuring that we can send debugging information to the serial port. Next, it configures the wake-up pin 
by calling wakeup_pin_init, preparing the microcontroller to respond to external wake-up signals. 
The check_reset_ source function is then called to determine the cause of the microcontroller’s 
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reset, whether from standby mode or another source, and to handle any necessary flag clearing. 
Following this, the external interrupt (EXTI) for PC13 is initialized with pc13_exti_init. The 
function then enters an infinite loop, while (1), maintaining the program’s operational state and 
waiting for interrupts or events to occur: 


static void check reset source (void) 


{ 
/*Enable clock access to PWR*/ 
RCC->APB1ENR l= RCC_APB1ENR_PWREN; 
if ((PWR->CSR & PWR_CSR_SBF) == (PWR_CSR_SBF) ) 
{ 
/*Clear Standby flag*/ 
PWR->CR |= PWR_CR_CSBF; 
printf ("System resume from Standby..... Vin ie") 5 
/*Wait for wakeup pin to be released*/ 
while (get wakeup pin state() == 0) {} 
} 
/*Check and Clear Wakeup flag*/ 
if ((PWR->CSR & PWR_CSR_WUF) == PWR_CSR_WUF ) 
{ 
PWR->CR |= PWR_CR_CWUF; 
} 
} 


This function determines the cause of the microcontroller’s reset and handles the necessary flags 
accordingly. It begins by enabling the clock for the power control (PWR) peripheral to ensure access 
to the power control and status registers. It then checks whether the standby flag (SBF) is set in the 
PWR_CSR register, which indicates that the system has resumed from standby mode. If the flag is set, it 
clears the SBF and prints a message, indicating that the system has resumed from standby. The function 
also waits for the wake-up pin to be released, ensuring that the pin is in a stable state. Additionally, 
it checks whether the Wakeup flag (WUF) is set and, if so, clears the flag to reset the wake-up status. 
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This is the interrupt callback function: 


static void exti_callback (void) 


{ 


standby wakeup pin setup() ; 


} 


And finally, we have the interrupt handler: 


void EXTI15 10 IRQHandler(void) { 


if ((EXTI->PR & LINE13) !=0) 

{ 
/*Clear PR flag*/ 
EXTI->PR |=LINE13; 
//Do something... 
exti_callback(); 

} 


} 


The exti_callback function, coupled with EXTI15_10_IRQHandler, ensures that the 
microcontroller properly handles external interrupts from the wake-up pin. The ext i_callback 
function is a straightforward handler that calls standby wakeup _pin_setup. The EXTI15 10_ 
IRQHandler function is an interrupt service routine specifically for EXT] lines 15 to 10. It checks 
whether the interrupt was triggered by line 13 (associated with the wake-up pin), and if so, it clears the 
interrupt pending flag to acknowledge the interrupt. After clearing the flag, it calls ext i_callback 
to handle the wake-up event. 


Now, let's test the project! 


To test the project, start by pressing the blue push button to enter standby mode. Remember that PAO 
is configured as the wake-up pin and is active low. In normal mode, connect a jumper wire from PAO 
to the ground. To trigger a wake-up event, pull out the jumper wire and connect it to 3.3V, causing a 
change in logic that will wake the microcontroller from standby mode. 


To test on the microcontroller, simply build the project and run it. 


Open RealTerm and configure the appropriate port and baud rate to view the printed message that 
confirms the system has resumed from standby mode. 
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Summary 


In this chapter, we delved into the critical aspects of power management and energy efficiency in embedded 
systems. Efficient power management is essential for prolonging battery life and ensuring optimal 
performance in embedded devices. We began by exploring various power management techniques, 
laying the foundation for understanding how to reduce power consumption in embedded systems. 


We then examined the different low-power modes available in STM32F4 microcontrollers, providing 
detailed insights into their configurations and applications. Then, we discussed the wake-up sources 
and triggers in STM32F4, which are essential to ensure that a microcontroller can promptly come 
out of low-power modes. 


Finally, we put theory into practice by developing a driver to enter standby mode and wake up 
the microcontroller. 


With this journey into bare-metal embedded C programming now complete, it’s important to acknowledge 
the profound expertise you've gained. By mastering the nuances of microcontroller architecture and 
the discipline of register-level programming, you've equipped yourself with the tools to create efficient 
and reliable embedded systems from the ground up. This book was designed to offer more than just 
technical instruction; it also aimed to instill a deeper understanding of the hardware and a methodical 
approach to firmware development. As you move forward, remember that true mastery in this field 
lies in the continuous application and refinement of these principles. 
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